A study is made of coding and decoding systems for a continuous channel
with an additive gaussian noise and subject to an average power limitation
at the transmitter. Upper and lower bounds are found for the error
probability in decoding with optimal codes and decoding systems. These bounds
are close together for signaling rates near channel capacity and also for
signaling rates near zero, but diverge between. Curves exhibiting these bounds
are given.

I. INTRODUCTION

Consider a communication channel of the following type: Once each 
second a real number may be chosen at the transmitting point. This 
number is transmitted to the receiving point but is perturbed by an 
additive gaussian noise, so that the $i$th real number, $s_i$, is received as
$s_i + x_i$. The $x_i$ are assumed independent gaussian random variables all
with the same variance N.

A code word of length n for such a channel is a sequence of n real 
numbers $(s_1, s_2, \ldots, s_n)$. This may be thought of geometrically as a
point in n-dimensional Euclidean space. The effect of noise is then to 
move this point to a nearby point according to a spherical gaussian 
distribution. 

A block code of length n with M words is a mapping of the integers
1, 2, $\ldots$, M into a set of M code words $w_1, w_2, \ldots, w_M$ (not necessarily
all distinct). Thus, geometrically, a block code consists of a collection
of M (or fewer) points with associated integers. It may be thought of as
a way of transmitting an integer from 1 to M to the receiving point (by
sending the corresponding code word). A decoding system for such a code
is a partitioning of the n-dimensional space into M subsets corresponding
to the integers from 1 to M. This is a way of deciding, at the receiving
point, on the transmitted integer. If the received signal is in subset
$S_i$, the transmitted message is taken to be integer i.

We shall assume throughout that all integers from 1 to M occur as
messages with equal probability 1/M. There is, then, for a given code
and decoding system, a definite probability of error for transmitting a
message. This is given by

$$P_e = \frac{1}{M} \sum_{i=1}^{M} P_{ei},$$

where $P_{ei}$ is the probability, if code word $w_i$ is sent, that it will be
decoded as an integer other than i. $P_{ei}$ is, of course, the total probability
under the gaussian distribution, centered on $w_i$, in the region
complementary to $S_i$.

An optimal decoding system for a code is one which minimizes the 
probability of error for the code. Since the gaussian density is monotone 
decreasing with distance, an optimal decoding system for a given code 
is one which decodes any received signal as the integer corresponding 
to the geometrically nearest code word. If there are several code words
at the same minimal distance, any of these may be used without affecting
the probability of error. A decoding system of this sort is called
minimum distance decoding or maximum likelihood decoding. It results in a
partitioning of the n-dimensional space into n-dimensional polyhedra,
or polytopes, around the different signal points, each polyhedron bounded
by a finite number (not more than M − 1) of (n − 1)-dimensional
hyperplanes.
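
The minimum distance rule described above can be illustrated with a short simulation. The following is a minimal sketch and not part of the original analysis: the function names `nearest_codeword` and `simulate_error_rate`, the zero-based message indices, and the two-word example code are assumptions chosen only for illustration.

```python
import math
import random


def nearest_codeword(received, codebook):
    """Minimum distance decoding: index of the code word closest to `received`."""
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(codebook):
        # Squared Euclidean distance; the square root is not needed to compare.
        dist = sum((r - w) ** 2 for r, w in zip(received, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def simulate_error_rate(codebook, noise_variance, trials, seed=0):
    """Monte Carlo estimate of P_e with equally likely messages (indices 0..M-1)."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_variance)
    errors = 0
    for _ in range(trials):
        sent = rng.randrange(len(codebook))
        # Each coordinate is perturbed by independent gaussian noise of variance N.
        received = [s + rng.gauss(0.0, sigma) for s in codebook[sent]]
        if nearest_codeword(received, codebook) != sent:
            errors += 1
    return errors / trials


# Hypothetical example: two antipodal words of length n = 4, power P = 1, noise N = 1.
example_code = [[1.0, 1.0, 1.0, 1.0], [-1.0, -1.0, -1.0, -1.0]]
estimate = simulate_error_rate(example_code, noise_variance=1.0, trials=20000)
```

For this antipodal pair the two words are a distance 4 apart, so the exact error probability is the gaussian tail beyond 2 standard deviations (about 0.023); the estimate should fall close to that value.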

We are interested in the problem of finding good codes, that is, placing
M points in such a way as to minimize the probability of error $P_e$.
If there were no conditions on the code words, it is evident that the
probability of error could be made as small as desired for any M, n and
N by placing the code words at sufficiently widely separated points in
the n space. In normal applications, however, there will be limitations
on the choice of code words that prevent this type of solution. An interesting
case that has been considered in the past is that of placing some
kind of average power limitation on the code words; the distance of the
points from the origin should not be too great. We may define three
different possible limitations of this sort:

i. All code words are required to have exactly the same power P or the
same distance from the origin. Thus, we are required to choose for code
words points lying on the surface of a sphere of radius $\sqrt{nP}$.

ii. All code words have power P or less. Here all code words are required
to lie interior to or on the surface of a sphere of radius $\sqrt{nP}$.

iii. The average power of all code words is P or less. Here, individual
code words may have a greater squared distance than nP but the average
of the set of squared distances cannot exceed nP.
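
The three limitations can be checked mechanically for any given code. The sketch below is only an illustration: the helper names `word_power` and `satisfied_constraints` and the tolerance `tol` are assumptions, and "power" is taken, as in the text, to be the squared distance from the origin divided by n.

```python
def word_power(word):
    """Power of a code word: squared distance from the origin divided by n."""
    return sum(x * x for x in word) / len(word)


def satisfied_constraints(codebook, P, tol=1e-9):
    """Report which of the three power limitations (i, ii, iii) a code satisfies."""
    powers = [word_power(word) for word in codebook]
    return {
        "i": all(abs(p - P) <= tol for p in powers),   # every word has power exactly P
        "ii": all(p <= P + tol for p in powers),       # every word has power P or less
        "iii": sum(powers) / len(powers) <= P + tol,   # average power is P or less
    }
```

A code satisfying condition i also satisfies ii, and one satisfying ii also satisfies iii, so the three cases are successively weaker.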

These three cases lead to quite similar results, as we shall see. The 
first condition is simpler and leads to somewhat sharper conclusions;
we shall first analyze this case and use these results for the other two
conditions. Therefore, until the contrary is stated, we assume all code words
to lie on the sphere of radius $\sqrt{nP}$.

Our first problem is to estimate, as well as possible, the probability
of error $P_e(M, n, \sqrt{P/N})$ for the best code of length n containing M
words each of power P and perturbed by noise of variance N. This minimal
or optimal probability of error we denote by $P_{e\,\mathrm{opt}}(M, n, \sqrt{P/N})$. It
is clear that, for fixed M, n, $P_{e\,\mathrm{opt}}$ will be a function only of the quotient
$A = \sqrt{P/N}$ by change of scale in the geometrical picture. We shall obtain
upper and lower bounds on $P_{e\,\mathrm{opt}}$ of several different types. Over an
important range of values these bounds are reasonably close together,
giving good estimates of $P_{e\,\mathrm{opt}}$. Some calculated values and curves are
given and the bounds are used to develop other bounds for the second
and third type conditions on the code words.

The geometrical approach we use is akin to that previously used by
the author¹ but carried here to a numerical conclusion. The problem is
also close to that studied by Rice, who obtained an estimate similar to
but not as sharp as one of our upper bounds. The work here is also
analogous to bounds given by Elias for the binary symmetric and binary
erasure channels, and related to bounds for the general discrete memoryless
channel given by the author.

In a general way, our bounds, both upper and lower, vary exponentially
with n for a fixed signaling rate, R, and fixed P/N. In fact, they
all can be put [letting R = (1/n) log M, so that R is the transmitting
rate for the code] in the form

$$e^{-E(R)n + o(n)}, \tag{1}$$

where E(R) is a suitable function of R (and of P/N, which we think
of as a fixed parameter). [In (1), o(n) is a term of order less than n; as
$n \to \infty$ it becomes small relative to E(R)n.]

Thus, for large n, the logarithm of the bound increases linearly with
n or, more precisely, the ratio of this logarithm to n approaches a
constant E(R). This quantity E(R) gives a crude measure of how rapidly
the probability of error approaches zero. We will call this type of
quantity a reliability. More precisely, we may define the reliability for a
channel as follows:

$$E(R) = \limsup_{n \to \infty} \, -\frac{1}{n} \log P_{e\,\mathrm{opt}}(R, n), \tag{2}$$

where $P_{e\,\mathrm{opt}}(R, n)$ is the optimal probability of error for codes of rate R
and length n. We will find that our bounds determine E(R) exactly over
an important range of rates, from a certain critical rate $R_c$ up to channel
capacity. Between zero and $R_c$, E is not exactly determined by our
bounds, but lies within a not too wide range.

In connection with the reliability E, it may be noted that, in (1)
above, knowledge of E(R) and n does not closely determine the probability
of error, even when n is large; the term o(n) can cause a large
and, in fact, increasing multiplier. On the other hand, given a desired
probability of error and E(R), the necessary value of the code length n
will be sharply determined when n is large; in fact, n will be asymptotic
to $-(1/E) \log P_e$. This inverse problem is perhaps the more natural
one in applications: given a required level of probability of error, how
long must the code be?
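
As a rough illustration of this inverse problem, the asymptotic relation between n and the probability of error can be evaluated directly. This is only a first-order sketch: the function name `approximate_block_length` is an assumption, the o(n) term is ignored, natural logarithms are used, and the reliability values in any call are hypothetical inputs rather than results from this paper.

```python
import math


def approximate_block_length(target_error, reliability):
    """First-order estimate n ~ -(1/E) log P_e (natural logarithm), ignoring o(n)."""
    if not 0.0 < target_error < 1.0:
        raise ValueError("target_error must lie strictly between 0 and 1")
    if reliability <= 0.0:
        raise ValueError("reliability E(R) must be positive")
    return math.ceil(-math.log(target_error) / reliability)
```

For example, with a hypothetical reliability of 0.1, a target error of one in a million gives a block length of roughly 139.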

The type of channel we are studying here is, of course, closely related 
to a band-limited channel (W cycles per second wide) perturbed by 
white gaussian noise. In a sense, such a band-limited channel can be 
thought of as having 2W coordinates per second, each independently 
perturbed by a gaussian variable. However, such an identification must 
be treated with care, since to control these degrees of freedom physically 
and stay strictly within the bandwidth would require an infinite delay. 

It is possible to stay very closely within a bandwidth W with a large 
but finite delay T, for example, by using (sin x)/x pulses with one tail 
deleted T from the maximum point. This deletion causes a spill-over 
outside the band of not more than the energy of the deleted part, an 
amount less than 1/T for the unit (sin x)/x case. By making T large,
we can approach the situation of staying within the allotted bandwidth 
and also, for example, approach zero probability of error at signaling 
rates close to channel capacity. 

However, for the problems we are studying here, delay as related to 
probability of error is of fundamental importance and, in applications 
of our results to such band-limited channels, the additional delay involved
in staying closely within the allotted channel must be remembered.
This is the reason for defining the channel as we have above.

II. SUMMARY

In this section we summarize briefly the main results obtained in the 
paper, both for easy reference and for readers who may be interested 
in the results without wishing to work through the detailed analysis. It 
might be said that the algebra involved is in several places unusually 
tedious. 

We use the following notations: 

P = signal power (each code word is on the surface of
      a sphere of radius $\sqrt{nP}$);
N = noise power (variance N in each dimension);
$A = \sqrt{P/N}$ = signal-to-noise "amplitude" ratio;
n = number of dimensions or block length of code;
M = number of code words;
R = (1/n) log M = signaling rate for a code (natural
      units);
$C = \tfrac{1}{2} \log [(P + N)/N] = \tfrac{1}{2} \log (A^2 + 1)$ = channel
      capacity (per degree of freedom);
$\theta$ = variable for half-angle of cones appearing in the
      geometrical problem which follows;
$\Omega(\theta)$ = solid angle in n space of a cone of half-angle $\theta$, or
      area of unit n sphere cut out by the cone;
$\theta_0 = \cot^{-1} A$ = cone angle relating to channel capacity;
$\theta_1$ = cone angle such that the solid angle $\Omega(\theta_1)$ of this
      cone is $(1/M)\Omega(\pi)$ [the solid angle of a sphere is
      $\Omega(\pi)$]; thus, $\theta_1$ is a cone angle related to the rate R;
$G = G(\theta) = \tfrac{1}{2}(A \cos\theta + \sqrt{A^2 \cos^2\theta + 4})$, a quantity
      which appears often in the formulas;
$\theta_c$ = the solution of $2 \cos\theta_c - A G(\theta_c) \sin^2\theta_c = 0$ (this
      critical angle is important in that the nature of
      the bounds changes according as $\theta_1 > \theta_c$ or $\theta_1 < \theta_c$);
$Q(\theta) = Q(\theta, A, n)$ = probability of a point X in n space,
      at distance $A\sqrt{n}$ from the origin, being moved
      outside a circular cone of half-angle $\theta$ with vertex
      at the origin and axis OX (the perturbation is
      assumed spherical gaussian with unit variance in
      all dimensions);
$E_L(\theta) = A^2/2 - \tfrac{1}{2} A G \cos\theta - \log (G \sin\theta)$, an exponent
      appearing in our bounds;
$P_{e\,\mathrm{opt}}(n, R, A)$ = probability of error for the best code of length n,
      signal-to-noise ratio A and rate R;
$\Phi(X)$ = normal distribution with zero mean and unit variance.
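
The quantities in the list above that are given in closed form can be evaluated numerically. The following is a minimal sketch under stated assumptions: the function names are illustrative, logarithms are natural, A is taken positive, and `theta_c` finds the critical angle by plain bisection on (0, π/2), a method chosen here for simplicity and not taken from the paper.

```python
import math


def capacity(A):
    """C = (1/2) log(A^2 + 1), in natural units per degree of freedom."""
    return 0.5 * math.log(A * A + 1.0)


def theta_0(A):
    """Cone angle theta_0 = arccot(A), for A > 0."""
    return math.atan2(1.0, A)


def G(theta, A):
    """G(theta) = (1/2)(A cos(theta) + sqrt(A^2 cos^2(theta) + 4))."""
    c = A * math.cos(theta)
    return 0.5 * (c + math.sqrt(c * c + 4.0))


def E_L(theta, A):
    """E_L(theta) = A^2/2 - (1/2) A G cos(theta) - log(G sin(theta))."""
    g = G(theta, A)
    return 0.5 * A * A - 0.5 * A * g * math.cos(theta) - math.log(g * math.sin(theta))


def critical_equation(theta, A):
    """Left side of 2 cos(theta) - A G(theta) sin^2(theta) = 0."""
    return 2.0 * math.cos(theta) - A * G(theta, A) * math.sin(theta) ** 2


def theta_c(A, iterations=200):
    """Root of critical_equation on (0, pi/2), found by bisection (an assumption)."""
    lo, hi = 1e-9, math.pi / 2  # the left side is positive near 0, negative at pi/2
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if critical_equation(mid, A) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Two quick consistency checks follow from the definitions: $-\log \sin\theta_0$ equals C, and $E_L(\theta_0) = 0$.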
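The results of the paper will now be summarized. $P_{e\,\mathrm{opt}}$ can be bounded
as follows:

$$Q(\theta_1) \le P_{e\,\mathrm{opt}} \le Q(\theta_1) - \int_0^{\theta_1} \frac{\Omega(\theta)}{\Omega(\theta_1)} \, dQ(\theta). \tag{3}$$

[Here $dQ(\theta)$ is negative, so the right additional term is positive.] These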
bounds can be written in terms of rather complex integrals. To obtain 
more insight into their behavior, we obtain, in the first place, asymptotic 
expressions for these bounds when n is large and, in the second place, 
cruder bounds which, however, are expressed in terms of elementary 
functions without integrals. 

The asymptotic lower bound is (asymptotically correct as $n \to \infty$)



f)(ft \ ~ — = = o — p - E L(0i)n 

HKU Vnw G \/l + G* sin X (cos X - AG sin 2 d x ) 



U) 



o^ 6 -^ Ofc>* 

The asymptotic upper bound is

$$Q(\theta_1) \left( 1 - \frac{\cos\theta_1 - AG \sin^2\theta_1}{2\cos\theta_1 - AG \sin^2\theta_1} \right). \tag{5}$$

This formula is valid for $\theta_0 < \theta_1 < \theta_c$. In this range the upper and lower
asymptotic bounds differ only by the factor in parentheses independent
of n. Thus, asymptotically, the probability of error is determined by these
relations to within a multiplying factor depending on the rate. For rates
near channel capacity ($\theta_1$ near $\theta_0$) the factor is just a little over unity;
the bounds are close together. For lower rates near $R_c$ (corresponding
to $\theta_c$), the factor becomes large. For $\theta_1 > \theta_c$ the upper bound asymptote
is



1 

e 



cos C sin 3 C G(6 C ) vV/?"(0 c )[l + G(0c)] 2 



-n[E L (6 c )-R) /g\ 



In addition to the asymptotic bound, we also obtain firm bounds,
valid for all n, but poorer than the asymptotic bounds when n is large.
The firm lower bound is

$$P_e \ge \frac{1}{6} \, \frac{\sqrt{n-1}}{n (A+1)^3 e^{(A+1)^2/2}} \, e^{-E_L(\theta_1) n}. \tag{7}$$
It may be seen that this is equal to the asymptotic bound multiplied by
a factor essentially independent of n. The firm upper bound {valid if
the maximum of $G^n (\sin\theta)^{2n-3} \exp[-(n/2)(A^2 - AG\cos\theta)]$ in the
range 0 to $\theta_1$ occurs at $\theta_1$} is

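$$P_{e\,\mathrm{opt}} \le \theta_1 \sqrt{2n} \, e^{3/2} \, G^n(\theta_1) (\sin\theta_1)^{n-2} \exp\left[ \frac{n}{2} (-A^2 + AG\cos\theta_1) \right] \left\{ 1 + \frac{1}{n\theta_1 \min[A, \; AG(\theta_1)\sin\theta_1 - \cot\theta_1]} \right\}. \tag{8}$$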

For rates near channel capacity, the upper and lower asymptotic 
bounds are both approximately the same, giving, where n is large and 
C — R small (but positive): 

where $\Phi$ is the normal distribution with unit variance.

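To relate the angle $\theta_1$ in the above formulas to the rate R, inequalities
are found:

$$\frac{\Gamma\left(\frac{n}{2}+1\right) (\sin\theta_1)^{n-1}}{n \, \Gamma\left(\frac{n+1}{2}\right) \pi^{1/2} \cos\theta_1} \left( 1 - \frac{1}{n} \tan^2\theta_1 \right) \le e^{-nR} \le \frac{\Gamma\left(\frac{n}{2}+1\right) (\sin\theta_1)^{n-1}}{n \, \Gamma\left(\frac{n+1}{2}\right) \pi^{1/2} \cos\theta_1}. \tag{10}$$

Asymptotically, it follows that:

$$e^{-nR} \sim \frac{\sin^n\theta_1}{\sqrt{2\pi n} \, \sin\theta_1 \cos\theta_1}. \tag{11}$$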

For low rates (particularly $R < R_c$), the above bounds diverge and
give less information. Two different arguments lead to other bounds
useful at low rates. The low rate upper bound is:

$$P_{e\,\mathrm{opt}} \le \frac{1}{\lambda A \sqrt{\pi n}} \, e^{n[R - (\lambda^2 A^2)/4]}, \tag{12}$$

where $\lambda$ satisfies $R = [1 - (1/n)] \log \left( \sin 2 \sin^{-1} \frac{\lambda}{\sqrt{2}} \right)$. Note that
as $R \to 0$, $\lambda \to 1$ and the upper bound is approximately

$$\frac{1}{A \sqrt{\pi n}} \, e^{-nA^2/4}.$$
The low rate lower bound may be written

For M large, this bound is close to $\Phi(-A\sqrt{n/2})$ and, if n is large,
this is asymptotic to $[1/(A\sqrt{\pi n})] \, e^{-nA^2/4}$. Thus, for rates close to zero and
large n we again have a situation where the bounds are close together
and give a sharp evaluation of $P_{e\,\mathrm{opt}}$.

With codes of rate $R \ge C + \epsilon$, where $\epsilon$ is fixed and positive, $P_{e\,\mathrm{opt}}$
approaches unity as the code length n increases.

III. THE LOWER BOUND BY THE "SPHERE-PACKING" ARGUMENT 

Suppose we have a code with M points each at distance $\sqrt{nP}$ from
the origin in n space. Since any two words are at equal distance from
the origin, the (n − 1)-dimensional hyperplane which bisects the connecting
line passes through the origin. Thus, all of the hyperplanes which determine the
polyhedra surrounding these points (for the optimal decoding system)
pass through the origin. These polyhedra, therefore, are pyramids with
apexes at the origin. The probability of error for the code is

$$P_e = \frac{1}{M} \sum_{i=1}^{M} P_{ei},$$

where $P_{ei}$ is the probability, if code word i is used, that it will be carried
by the noise outside the pyramid around the ith word. The probability
of being correct is

$$1 - \frac{1}{M} \sum_{i=1}^{M} P_{ei} = \frac{1}{M} \sum_{i=1}^{M} (1 - P_{ei}),$$

that is, the average probability of a code word being moved to a point 
within its own pyramid. 

Let the ith pyramid have a solid angle $\Omega_i$ (that is, $\Omega_i$ is the area cut
out by the pyramid on the unit n-dimensional spherical surface). Consider,
for comparison, a right circular n-dimensional cone with the same
solid angle $\Omega_i$ and having a code word on its axis at distance $\sqrt{nP}$. We
assert that the probability of this comparison point being moved to within
its cone is greater than that of $w_i$ being moved to within its pyramid. This
is because of the monotone decreasing probability density with distance
from the code word. The pyramid can be deformed into the cone by 
moving small conical elements from far distances to nearer distances, 
this movement continually increasing probability. This is suggested for 
a three-dimensional case in Fig. 1. Moving small conical elements from 
outside the cone to inside it increases probability, since the probability 
density is greater inside the cone than outside. Formally, this follows 
by integrating the probability density over the region $R_1$ in the cone
but not in the pyramid, and in the region $R_2$ in the pyramid but not in
the cone. The first is greater than the solid angle $\Omega$ of $R_1$ times the
density at the edge of the cone. The value for the pyramid is less than 
the same quantity. 

We have, then, a bound on the probability of error $P_e$ for a given
code:

$$P_e \ge \frac{1}{M} \sum_{i=1}^{M} Q^*(\Omega_i), \tag{14}$$

where $\Omega_i$ is the solid angle for the ith pyramid, and $Q^*(\Omega)$ is the
probability of a point being carried outside a surrounding cone of solid angle
$\Omega$. It is also true that

$$\sum_{i=1}^{M} \Omega_i = \Omega_0,$$

the solid angle of an n sphere, since the original pyramids corresponded
to a partitioning of the sphere. Now, using again the property that the
density decreases with distance, it follows that $Q^*(\Omega)$ is a convex function
of $\Omega$. Then we may further simplify this bound by replacing each $\Omega_i$ by
the average $\Omega_0/M$. In fact,

$$\frac{1}{M} \sum_{i=1}^{M} Q^*(\Omega_i) \ge Q^*\left( \frac{\Omega_0}{M} \right),$$

and hence

$$P_e \ge Q^*\left( \frac{\Omega_0}{M} \right).$$

Fig. 1 — Pyramid deformed into cone by moving small conical elements from
far to nearer distances.

It is more convenient to work in terms of the half-cone angle $\theta$ rather
than solid angles $\Omega$. We define $Q(\theta)$ to be the probability of being carried
outside a cone of half-angle $\theta$. Then, if $\theta_1$ corresponds to the cone of
solid angle $\Omega_0/M$, the bound above may be written

$$P_e \ge Q(\theta_1). \tag{15}$$

This is our fundamental lower bound for $P_e$. It still needs translation
into terms of P, N, M and n, and estimation in terms of simple functions.

It may be noted that this bound is exactly the probability of error 
that would occur if it were possible to subdivide the space into M con- 
gruent cones, one for each code word, and place the code words on the 
axes of these cones. It is, of course, very plausible intuitively that any 
actual code would have a higher probability of error than would that 
with such a conical partitioning. Such a partitioning clearly is possible 
only for n = 1 or 2, if M > 2. 

The lower bound $Q(\theta_1)$ can be evaluated in terms of a distribution
familiar to statisticians as the noncentral t-distribution.⁵ The noncentral
t may be thought of as the probability that the ratio of a random variable
$(z + \delta)$ to the root mean square of f other random variables

$$\sqrt{\frac{1}{f} \sum_{i=1}^{f} x_i^2}$$

does not exceed t, where all variates $x_i$ and z are gaussian and independent
with mean zero and unit variance and $\delta$ is a constant. Thus, denoting
it by $P(f, \delta, t)$, we have

$$P(f, \delta, t) = \Pr\left\{ \frac{z + \delta}{\sqrt{\frac{1}{f} \sum_{i=1}^{f} x_i^2}} \le t \right\}. \tag{16}$$

In terms of our geometrical picture, this amounts to a spherical gaussian
distribution with unit variance about a point $\delta$ from the origin in f + 1
space. The probability $P(f, \delta, t)$ is the probability of being outside a
cone from the origin having the line segment to the center of the
distribution as axis. The cotangent of the half-cone angle $\theta$ is $t/\sqrt{f}$. Thus
the probability $Q(\theta)$ is seen to be given by

$$Q(\theta) = P\left( n - 1, \; \sqrt{\frac{nP}{N}}, \; \sqrt{n-1} \cot\theta \right). \tag{17}$$



The noncentral t-distribution does not appear to have been very extensively
tabled. Johnson and Welch⁵ give some tables, but they are aimed
at other types of application and are inconvenient for the purpose at
hand. Further, they do not go to large values of n. We therefore will
estimate this lower bound by developing an asymptotic formula for the
cumulative distribution $Q(\theta)$ and also the density distribution $dQ/d\theta$.
First, however, we will find an upper bound on $P_{e\,\mathrm{opt}}$ in terms of the
same distribution $Q(\theta)$.
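
For small n the quantity $Q(\theta)$ can also be estimated directly from its geometrical definition, without tables. The sketch below is a plain Monte Carlo estimate and is not the asymptotic method developed in the paper; the function name `estimate_Q`, the fixed seed, and the trial counts are assumptions for illustration. The signal point is placed at distance $A\sqrt{n}$ along the first axis and perturbed by unit-variance gaussian noise in every coordinate.

```python
import math
import random


def estimate_Q(theta, A, n, trials, seed=0):
    """Monte Carlo estimate of Q(theta): probability of leaving the cone of half-angle theta."""
    rng = random.Random(seed)
    cos_theta = math.cos(theta)
    signal = A * math.sqrt(n)  # distance of the signal point from the origin
    outside = 0
    for _ in range(trials):
        first = signal + rng.gauss(0.0, 1.0)  # coordinate along the cone axis
        rest_sq = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n - 1))
        norm = math.sqrt(first * first + rest_sq)
        # The received point is outside the cone when its angle to the axis exceeds theta,
        # that is, when cos(angle) = first / norm is less than cos(theta).
        if first < norm * cos_theta:
            outside += 1
    return outside / trials
```

The test `first < norm * cos_theta` is the same event as the noncentral t ratio falling below $\sqrt{n-1}\cot\theta$, with f = n − 1 and $\delta = A\sqrt{n}$. For n = 1 the cone degenerates to a half-line and the estimate should approach the gaussian tail probability beyond A.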

IV. UPPER BOUND BY A RANDOM CODE METHOD

The upper bound for $P_{e\,\mathrm{opt}}$ will be found by using an argument based
on random codes. Consider the ensemble of codes obtained by placing
M points randomly on the surface of a sphere of radius $\sqrt{nP}$. More
precisely, each point is placed independently of all others with probability
measure proportional to surface area or, equivalently, to solid angle.
Each of the codes in the ensemble is to be decoded by the minimum 
distance process. We wish to compute the average probability of error for 
this ensemble of codes. 
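
One member of this ensemble can be drawn as follows. This is a minimal sketch: the function name `random_sphere_code` and the seed argument are assumptions, and the sampling method (normalizing a spherical gaussian vector, whose direction is uniformly distributed) is a standard technique that is not described in the paper itself.

```python
import math
import random


def random_sphere_code(M, n, P, seed=0):
    """Draw M points independently and uniformly on the sphere of radius sqrt(nP)."""
    rng = random.Random(seed)
    radius = math.sqrt(n * P)
    code = []
    for _ in range(M):
        # A spherical gaussian vector has a uniformly distributed direction.
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in g))
        code.append([radius * x / norm for x in g])
    return code
```

Every word produced this way has squared length nP, so the code satisfies the equal-power condition assumed throughout this part of the analysis.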

Because of the symmetry of the code points, the probability of error
averaged over the ensemble will be equal to M times the average probability
of error due to any particular code point, for example, code point
1. This may be computed as follows. The probability of message number
1 being transmitted is 1/M. The differential probability that it will
be displaced by the noise into the region between a cone of half-angle
$\theta$ and one of half-angle $\theta + d\theta$ (these cones having vertex at the origin
and axis out to code word 1) is $-dQ(\theta)$. [Recall that $Q(\theta)$ was defined
as the probability that noise would carry a point outside the cone of
angle $\theta$ with axis through the signal point.] Now consider the cone of
half-angle $\theta$ surrounding such a received point (not the cone about the
message point just described). If this cone is empty of signal points,
the received word will be decoded correctly as message 1. If it is not
empty, other points will be nearer and the received signal will be
incorrectly decoded. (The probability of two or more points at exactly the
same distance is readily seen to be zero and may be ignored.)

The probability in the ensemble of codes of the cone of half-angle $\theta$
being empty is easily calculated. The probability that any particular
code word, say code word 2 or code word 3, etc. is in the cone is given
by $\Omega(\theta)/\Omega(\pi)$, the ratio of the solid angle in the cone to the total solid
angle. The probability a particular word is not in the cone is
$1 - \Omega(\theta)/\Omega(\pi)$. The probability that all M − 1 other words are not in the cone
is $[1 - \Omega(\theta)/\Omega(\pi)]^{M-1}$ since these are, in the ensemble of codes, placed
independently. The probability of error, then, contributed by situations
where the point 1 is displaced by an angle from $\theta$ to $\theta + d\theta$ is given by
$-(1/M)\{1 - [1 - \Omega(\theta)/\Omega(\pi)]^{M-1}\} \, dQ(\theta)$. The total average probability
of error for all code words and all noise displacements is then given
by

$$P_{er} = -\int_0^{\pi} \left\{ 1 - \left[ 1 - \frac{\Omega(\theta)}{\Omega(\pi)} \right]^{M-1} \right\} dQ(\theta). \tag{18}$$

This is an exact formula for the average probability of error $P_{er}$ for our
random ensemble of codes. Since this is an average of $P_e$ for particular
codes, there must exist particular codes in the ensemble with at least
this good a probability of error, and certainly then $P_{e\,\mathrm{opt}} \le P_{er}$.

We may weaken this bound slightly but obtain a simpler formula for
calculation as follows. Note first that $\{1 - [1 - \Omega(\theta)/\Omega(\pi)]^{M-1}\} \le 1$ and
also, using the well-known inequality $(1 - x)^n \ge 1 - nx$, we have
$\{1 - [1 - \Omega(\theta)/\Omega(\pi)]^{M-1}\} \le (M - 1)[\Omega(\theta)/\Omega(\pi)] \le M[\Omega(\theta)/\Omega(\pi)]$.
Now, break the integral into two parts, $0 \le \theta \le \theta_1$ and $\theta_1 \le \theta \le \pi$. In
the first range, use the inequality just given and, in the second range,
bound the expression in braces by 1. Thus,

$$P_{er} \le -\int_0^{\theta_1} M \left[ \frac{\Omega(\theta)}{\Omega(\pi)} \right] dQ(\theta) - \int_{\theta_1}^{\pi} dQ(\theta),$$

$$P_{er} \le -\frac{M}{\Omega(\pi)} \int_0^{\theta_1} \Omega(\theta) \, dQ(\theta) + Q(\theta_1). \tag{19}$$

It is convenient to choose for $\theta_1$ the same value as appeared in the lower
bound; that is, the $\theta_1$ such that $\Omega(\theta_1)/\Omega(\pi) = 1/M$; in other words,
the $\theta_1$ for which one expects one point within the $\theta_1$ cone. The second
term in (19) is then the same as the lower bound on $P_{e\,\mathrm{opt}}$ obtained
previously. In fact, collecting these results, we have

$$Q(\theta_1) \le P_{e\,\mathrm{opt}} \le Q(\theta_1) - \frac{M}{\Omega(\pi)} \int_0^{\theta_1} \Omega(\theta) \, dQ(\theta), \tag{20}$$

where $M\Omega(\theta_1) = \Omega(\pi)$. These are our fundamental lower and upper bounds
on $P_{e\,\mathrm{opt}}$.

We now wish to evaluate and estimate $\Omega(\theta)$ and $Q(\theta)$.

V. FORMULAS FOR RATE R AS A FUNCTION OF THE CONE ANGLE $\theta$

Our bounds on probability of error involve the cone angle $\theta_1$ such that
the solid angle of the cone is $1/M = e^{-nR}$ times the full solid angle of a
sphere. To relate these quantities more explicitly we calculate the solid
angle of a cone in n dimensions with half-angle $\theta$. In Fig. 2 this means
calculating the (n − 1)-dimensional area of the cap cut out by the cone
on the unit sphere. This is obtained by summing the contributions due
to ring-shaped elements of area (spherical surfaces in n − 1 dimensions
of radius $\sin\theta$ and of incremental width $d\theta$). Thus, the total area of the
cap is given by

$$\Omega(\theta_1) = \frac{(n-1)\pi^{(n-1)/2}}{\Gamma\left(\frac{n+1}{2}\right)} \int_0^{\theta_1} (\sin\theta)^{n-2} \, d\theta. \tag{21}$$

Here we used the formula for the surface $S_n(r)$ of a sphere of radius r
in n dimensions, $S_n(r) = n\pi^{n/2} r^{n-1} / \Gamma(n/2 + 1)$.

Fig. 2 — Cap cut out by the cone on the unit sphere.

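To obtain simple inequalities and asymptotic expressions for $\Omega(\theta_1)$,
make the change of variable in the integral $x = \sin\theta$, $d\theta = (1 - x^2)^{-1/2} dx$.
Let $x_1 = \sin\theta_1$ and assume $\theta_1 < \pi/2$, so that $x_1 < 1$. Using the mean
value theorem we obtain

$$(1 - x^2)^{-1/2} = (1 - x_1^2)^{-1/2} + \frac{\alpha}{(1 - \alpha^2)^{3/2}} (x - x_1), \tag{22}$$

where $0 \le \alpha \le x_1$. The term $\alpha(1 - \alpha^2)^{-3/2}$ must lie in the range from
0 to $x_1(1 - x_1^2)^{-3/2}$ since this is a monotone increasing function. Hence
we have the inequalities

$$(1 - x_1^2)^{-1/2} + \frac{(x - x_1) x_1}{(1 - x_1^2)^{3/2}} \le (1 - x^2)^{-1/2} \le (1 - x_1^2)^{-1/2}, \qquad 0 \le x \le x_1. \tag{23}$$

Note that $x - x_1$ is negative, so the correction term on the left is of
the right sign. If we use these in the integral for $\Omega(\theta_1)$ we obtain

$$\frac{(n-1)\pi^{(n-1)/2}}{\Gamma\left(\frac{n+1}{2}\right)} \int_0^{x_1} x^{n-2} \left[ (1 - x_1^2)^{-1/2} + \frac{(x - x_1) x_1}{(1 - x_1^2)^{3/2}} \right] dx \le \Omega(\theta_1) \le \frac{(n-1)\pi^{(n-1)/2}}{\Gamma\left(\frac{n+1}{2}\right)} \int_0^{x_1} \frac{x^{n-2} \, dx}{\sqrt{1 - x_1^2}}, \tag{24}$$

$$\frac{(n-1)\pi^{(n-1)/2}}{\Gamma\left(\frac{n+1}{2}\right)} \left[ \frac{x_1^{n-1}}{(n-1)\sqrt{1 - x_1^2}} + \frac{x_1^{n+1}}{n(1 - x_1^2)^{3/2}} - \frac{x_1^{n+1}}{(n-1)(1 - x_1^2)^{3/2}} \right] \le \Omega(\theta_1) \le \frac{\pi^{(n-1)/2} x_1^{n-1}}{\Gamma\left(\frac{n+1}{2}\right) \sqrt{1 - x_1^2}}, \tag{25}$$

$$\frac{\pi^{(n-1)/2} (\sin\theta_1)^{n-1}}{\Gamma\left(\frac{n+1}{2}\right) \cos\theta_1} \left( 1 - \frac{1}{n} \tan^2\theta_1 \right) \le \Omega(\theta_1) \le \frac{\pi^{(n-1)/2} (\sin\theta_1)^{n-1}}{\Gamma\left(\frac{n+1}{2}\right) \cos\theta_1}. \tag{26}$$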

Therefore, as $n \to \infty$, $\Omega(\theta_1)$ is asymptotic to the expression on the right.
The surface of the unit n sphere is $n\pi^{n/2}/\Gamma(n/2 + 1)$, hence,

$$\frac{\Gamma\left(\frac{n}{2}+1\right) (\sin\theta_1)^{n-1}}{n \, \Gamma\left(\frac{n+1}{2}\right) \pi^{1/2} \cos\theta_1} \left( 1 - \frac{1}{n} \tan^2\theta_1 \right) \le e^{-nR} = \frac{\Omega(\theta_1)}{\Omega(\pi)} \le \frac{\Gamma\left(\frac{n}{2}+1\right) (\sin\theta_1)^{n-1}}{n \, \Gamma\left(\frac{n+1}{2}\right) \pi^{1/2} \cos\theta_1}. \tag{27}$$

Replacing the gamma functions by their asymptotic expressions, we obtain

$$e^{-nR} = \frac{\sin^n\theta_1}{\sqrt{2\pi n} \, \sin\theta_1 \cos\theta_1} \left[ 1 + O\left( \frac{1}{n} \right) \right]. \tag{28}$$

Thus $e^{-nR} \sim \sin^n\theta_1 / (\sqrt{2\pi n} \, \sin\theta_1 \cos\theta_1)$ and $e^{-R} \sim \sin\theta_1$. The somewhat
sharper expression for $e^{-nR}$ must be used when attempting asymptotic
evaluations of $P_e$, since $P_e$ is changed by a factor when $\theta_1$ is changed
by, for example, k/n. However, when only the reliability E is of interest,
the simpler $R \sim -\log \sin\theta_1$ may be used.
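
The inequalities (27) and the asymptotic form (28) can be compared with a direct numerical evaluation of the solid-angle ratio. The sketch below makes several assumptions for illustration: the function names are hypothetical, the cap integral is evaluated by Simpson's rule with a fixed even number of steps, $n \ge 2$, and $0 < \theta_1 < \pi/2$. The normalizing constant uses the known value of the integral of $(\sin\theta)^{n-2}$ over 0 to $\pi$, which is $\sqrt{\pi}\,\Gamma((n-1)/2)/\Gamma(n/2)$.

```python
import math


def cap_fraction(theta1, n, steps=20000):
    """Omega(theta1)/Omega(pi) = e^{-nR}, by Simpson's rule on the cap integral."""
    def integrand(t):
        return math.sin(t) ** (n - 2)

    h = theta1 / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(theta1)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    integral = total * h / 3.0
    # Divide by the integral of sin^(n-2) over [0, pi]; lgamma avoids overflow.
    log_norm = math.lgamma(n / 2) - 0.5 * math.log(math.pi) - math.lgamma((n - 1) / 2)
    return integral * math.exp(log_norm)


def cap_fraction_bounds(theta1, n):
    """Lower and upper bounds on e^{-nR} from the inequalities labelled (27)."""
    log_coeff = (math.lgamma(n / 2 + 1) - math.log(n)
                 - math.lgamma((n + 1) / 2) - 0.5 * math.log(math.pi))
    upper = math.exp(log_coeff + (n - 1) * math.log(math.sin(theta1))) / math.cos(theta1)
    lower = upper * (1.0 - math.tan(theta1) ** 2 / n)
    return lower, upper


def cap_fraction_asymptotic(theta1, n):
    """Leading term of (28): sin^n(theta1) / (sqrt(2 pi n) sin(theta1) cos(theta1))."""
    s, c = math.sin(theta1), math.cos(theta1)
    return s ** n / (math.sqrt(2.0 * math.pi * n) * s * c)
```

The rate for a given cone angle then follows as `-math.log(cap_fraction(theta1, n)) / n`. In three dimensions the ratio reduces to $(1 - \cos\theta_1)/2$, which gives a convenient check of the numerical integration.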

VI. ASYMPTOTIC FORMULAS FOR $Q(\theta)$ AND $Q'(\theta)$

In Fig. 3, O is the origin, S is a signal point and the plane of the figure
is a plane section in the n-dimensional space. The lines OA and OB
represent a (circular) cone of angle $\theta$ about OS (that is, the intersection
of this cone with the plane of the drawing). The lines OA' and OB'
correspond to a slightly larger cone of angle $\theta + d\theta$. We wish to estimate
the probability $-dQ_n(\theta)$ of the signal point S being carried by noise
into the region between these cones. From this, we will further calculate
the probability $Q_n(\theta)$ of S being carried outside the $\theta$ cone. What is
desired in both cases is an asymptotic estimate, a simple formula whose
ratio to the true value approaches 1 as n, the number of dimensions,
increases.

Fig. 3 — Plane of cone of half-angle $\theta$.

The noise perturbs all coordinates normally and independently with
variance 1. It produces a spherical gaussian distribution in the
n-dimensional space. The probability density of its moving the signal point
a distance d is given by

$$\frac{1}{(2\pi)^{n/2}} \, e^{-d^2/2} \, dV, \tag{29}$$

where dV is the element of volume. In Fig. 4 we wish to first calculate
the probability density for the crosshatched ring-shaped region between
the two cones and between spheres about the origin of radius r and
r + dr. The distance of this ring from the signal point is given by the
cosine law as

$$d = (r^2 + A^2 n - 2rA\sqrt{n} \cos\theta)^{1/2}. \tag{30}$$

Fig. 4 — Special value $\theta_0$.

The differential volume of the ring-shaped region is $r \, dr \, d\theta$ times the
surface of a sphere of radius $r \sin\theta$ in (n − 1)-dimensional space; that is

$$r \, dr \, d\theta \, \frac{(n-1)\pi^{(n-1)/2} (r \sin\theta)^{n-2}}{\Gamma\left(\frac{n+1}{2}\right)}. \tag{31}$$

Hence, the differential probability for the ring-shaped region is

$$\frac{1}{(\sqrt{2\pi})^n} \exp\left[ \frac{-(r^2 + A^2 n - 2rA\sqrt{n} \cos\theta)}{2} \right] \left[ \frac{(n-1)\pi^{(n-1)/2} (r \sin\theta)^{n-2}}{\Gamma\left(\frac{n+1}{2}\right)} \right] r \, dr \, d\theta. \tag{32}$$

The differential probability $-dQ$ of being carried between the two cones
is the integral of this expression from zero to infinity on dr:

$$-dQ = \frac{(n-1) \, d\theta}{2^{n/2} \sqrt{\pi} \, \Gamma\left(\frac{n+1}{2}\right)} \int_0^{\infty} \exp\left[ \frac{-(r^2 + A^2 n - 2rA\sqrt{n} \cos\theta)}{2} \right] (r \sin\theta)^{n-2} \, r \, dr. \tag{33}$$

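In the exponent we can think of $A^2 n$ as $A^2 n(\sin^2\theta + \cos^2\theta)$. The $\cos^2\theta$
part then combines with the other terms to give a perfect square

$$(r - A\sqrt{n} \cos\theta)^2$$

and the $\sin^2\theta$ term can be taken outside the integral. Thus

$$-dQ = \frac{(n-1) \exp\left( -\frac{A^2 n \sin^2\theta}{2} \right) (\sin\theta)^{n-2} \, d\theta}{2^{n/2} \sqrt{\pi} \, \Gamma\left(\frac{n+1}{2}\right)} \int_0^{\infty} \exp\left[ \frac{-(r - A\sqrt{n} \cos\theta)^2}{2} \right] r^{n-1} \, dr. \tag{34}$$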
We can now direct our attention to estimating the integral, which we
call K. The integral can be expressed exactly as a finite, but complicated,
sum involving normal distribution functions by a process of continued
integration by parts. We are, however, interested in a simple formula
giving the asymptotic behavior of the integral as n becomes infinite.
This problem was essentially solved by David and Kruskal,⁶ who prove
the following asymptotic formula as a lemma:

$$\int_0^{\infty} z^{\nu} \exp\left( -\tfrac{1}{2} z^2 + z\sqrt{\nu + 1} \, w \right) dz \sim \sqrt{2\pi} \left( \frac{\tilde{z}}{e} \right)^{\nu} \exp\left( \tfrac{1}{2} \tilde{z}^2 \right) T, \tag{35}$$

as $\nu \to \infty$, w is fixed, $T = [1 + \tfrac{1}{4}(\sqrt{w^2 + 4} - w)^2]^{-1/2}$ and

$$\tilde{z} = \tfrac{1}{2}\sqrt{\nu + 1} \, w + \sqrt{\tfrac{1}{4}(\nu + 1) w^2 + \nu}.$$

This is proved by showing that the main contribution to the integral
is essentially in the neighborhood of the point $\tilde{z}$ where the integrand is a
maximum. Near this point, when $\nu$ is large, the function behaves about
as a normal distribution.
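
The lemma (35) can be checked numerically for moderate $\nu$. The following sketch works with logarithms to avoid overflow and integrates only over a window around the maximum of the integrand; the function names, the window half-width of 12, and the use of Simpson's rule are all assumptions made for this illustration and are not part of the David and Kruskal proof.

```python
import math


def log_integrand(z, nu, w):
    """Logarithm of z^nu exp(-z^2/2 + z sqrt(nu+1) w)."""
    return nu * math.log(z) - 0.5 * z * z + z * math.sqrt(nu + 1.0) * w


def z_tilde(nu, w):
    """Location of the maximum of the integrand, as given with the lemma."""
    return 0.5 * math.sqrt(nu + 1.0) * w + math.sqrt(0.25 * (nu + 1.0) * w * w + nu)


def log_integral_numeric(nu, w, half_width=12.0, steps=4000):
    """log of the integral, by Simpson's rule on a window around z_tilde."""
    zt = z_tilde(nu, w)
    peak = log_integrand(zt, nu, w)
    lo = max(zt - half_width, 1e-12)  # stay on the positive axis
    hi = zt + half_width
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        # Subtract the peak value so every exponential is at most 1.
        total += weight * math.exp(log_integrand(lo + k * h, nu, w) - peak)
    return peak + math.log(total * h / 3.0)


def log_integral_asymptotic(nu, w):
    """log of sqrt(2 pi) (z_tilde/e)^nu exp(z_tilde^2/2) T."""
    zt = z_tilde(nu, w)
    T = (1.0 + 0.25 * (math.sqrt(w * w + 4.0) - w) ** 2) ** -0.5
    return 0.5 * math.log(2.0 * math.pi) + nu * (math.log(zt) - 1.0) + 0.5 * zt * zt + math.log(T)
```

Because the logarithm of the integrand has curvature at most −1 everywhere, the part of the integral outside the window is negligible. The difference between the two logarithms should shrink toward zero as $\nu$ grows.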

The integral K in (34) that we wish to evaluate is, except for a multiplying
factor, of the form appearing in the lemma, with

$$z = r, \qquad w = A \cos\theta, \qquad \nu = n - 1.$$

The integral then becomes

$$K = \exp\left( -\frac{A^2 n \cos^2\theta}{2} \right) \int_0^{\infty} z^{n-1} \exp\left( -\frac{z^2}{2} + zA\sqrt{n} \cos\theta \right) dz \sim \exp\left( -\frac{A^2 n \cos^2\theta}{2} \right) \sqrt{2\pi} \left( \frac{\tilde{z}}{e} \right)^{n-1} T \exp\left( \frac{\tilde{z}^2}{2} \right). \tag{36}$$

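We have

$$\tilde{z} = \tfrac{1}{2}\sqrt{n} A \cos\theta + \sqrt{\tfrac{1}{4} n A^2 \cos^2\theta + n - 1}$$
$$= \sqrt{n} \left[ \tfrac{1}{2} A \cos\theta + \sqrt{\frac{A^2}{4} \cos^2\theta + 1 - \frac{1}{n}} \right]$$
$$= \sqrt{n} \left[ \tfrac{1}{2} A \cos\theta + \sqrt{\frac{A^2}{4} \cos^2\theta + 1} - \frac{1}{2n\sqrt{\frac{A^2}{4} \cos^2\theta + 1}} + O\left( \frac{1}{n^2} \right) \right]. \tag{37}$$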

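Letting

$$G = \tfrac{1}{2}[A \cos\theta + \sqrt{A^2 \cos^2\theta + 4}],$$

we have

$$\tilde{z} = \sqrt{n} \left[ G - \frac{1}{n\sqrt{A^2 \cos^2\theta + 4}} + O\left( \frac{1}{n^2} \right) \right],$$

$$\left( \frac{\tilde{z}}{e} \right)^{n-1} = \left( \frac{\sqrt{n} G}{e} \right)^{n-1} \left[ 1 - \frac{1}{nG\sqrt{A^2 \cos^2\theta + 4}} + O\left( \frac{1}{n^2} \right) \right]^{n-1}. \tag{38}$$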

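Also,

$$\exp \frac{\tilde{z}^2}{2} = \exp\left\{ \tfrac{1}{2} n G^2 \left[ 1 - \frac{1}{nG\sqrt{A^2 \cos^2\theta + 4}} + O\left( \frac{1}{n^2} \right) \right]^2 \right\} \sim \exp\left( \tfrac{1}{2} n G^2 - \frac{G}{\sqrt{A^2 \cos^2\theta + 4}} \right) = \exp\left[ \frac{n}{2}(1 + AG \cos\theta) - \frac{G}{\sqrt{A^2 \cos^2\theta + 4}} \right], \tag{39}$$

since, on squaring G, we find $G^2 = 1 + AG \cos\theta$. Collecting terms:

$$K \sim T\sqrt{2\pi} \left( \frac{\sqrt{n} G}{e} \right)^{n-1} \exp\left[ -\frac{n}{2} A^2 \cos^2\theta + \frac{n}{2}(1 + AG \cos\theta) - \frac{1}{G\sqrt{A^2 \cos^2\theta + 4}} - \frac{G}{\sqrt{A^2 \cos^2\theta + 4}} \right]$$
$$= T\sqrt{2\pi} \, n^{(n-1)/2} G^{n-1} e^{-n/2} \exp\left( -\frac{n}{2} A^2 \cos^2\theta + \frac{n}{2} AG \cos\theta \right), \tag{40}$$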

since a little algebra shows that the terms

$$1 - \frac{1}{G\sqrt{A^2\cos^2\theta+4}} - \frac{G}{\sqrt{A^2\cos^2\theta+4}}$$

in the exponential cancel to zero. The coefficient of the integral (34),
using the asymptotic expression for $\Gamma[(n+1)/2]$, is asymptotic to

$$\frac{(n-1)\,e^{-(A^2\sin^2\theta)(n/2)}\,\sin^{n-2}\theta\;e^{(n+1)/2}}{2^{n/2}\sqrt{\pi}\left(\frac{n+1}{2}\right)^{n/2}\sqrt{2\pi}}. \tag{41}$$

Combining with the above and collecting terms (we find that $T = G/\sqrt{1+G^2}$):

$$-\frac{dQ}{d\theta} \sim \frac{n-1}{\sqrt{\pi n}\,\sqrt{1+G^2}\,\sin^2\theta}\left[G\sin\theta\exp\left(-\frac{A^2}{2} + \tfrac12 AG\cos\theta\right)\right]^n. \tag{42}$$

This is our desired asymptotic expression for the density $dQ/d\theta$.

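As we have arranged it, the coefficient increases essentially as $\sqrt{n}$
and there is another term of the form $e^{-E_L(\theta)n}$, where

$$E_L(\theta) = \frac{A^2}{2} - \tfrac12 AG\cos\theta - \log(G\sin\theta).$$

It can be shown that if we use for $\theta$ the special value $\theta_0 = \cot^{-1}A$ (see
Fig. 4) then $E_L(\theta_0) = 0$ and also $E'_L(\theta_0) = 0$. In fact, for this value

$$G(\theta_0) = \tfrac12\left(A\cos\theta_0 + \sqrt{A^2\cos^2\theta_0 + 4}\,\right) = \tfrac12\left(\frac{A^2}{\sqrt{A^2+1}} + \sqrt{\frac{A^4}{A^2+1} + 4}\,\right) = \tfrac12\left(\frac{A^2}{\sqrt{A^2+1}} + \frac{A^2+2}{\sqrt{A^2+1}}\right) = \sqrt{A^2+1} = \csc\theta_0.$$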



630 THE BELL SYSTEM TECHNICAL JOURNAL, MAY 1959 

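Hence the two terms in the logarithm cancel. Also

$$\frac{A^2}{2} - \tfrac12 AG\cos\theta_0 = \frac{A^2}{2} - \tfrac12 A\sqrt{A^2+1}\,\frac{A}{\sqrt{A^2+1}} = 0.$$

So $E_L(\theta_0) = 0$. We also have

$$E'_L(\theta) = \tfrac12 AG\sin\theta - \tfrac12 AG'\cos\theta - \frac{G'}{G} - \cot\theta. \tag{43}$$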

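When evaluated, the term $-G'/G$ simplifies, after considerable algebra,
to

$$\frac{A\sin\theta}{\sqrt{A^2\cos^2\theta+4}}.$$

Substituting this and the other terms we obtain

$$E'_L(\theta) = \frac{A^2}{2}\sin\theta\cos\theta + \frac{A^3\cos^2\theta\,\sin\theta}{4\sqrt{A^2\cos^2\theta+4}} + \frac{A}{4}\sqrt{A^2\cos^2\theta+4}\,\sin\theta + \frac{A\sin\theta}{\sqrt{A^2\cos^2\theta+4}} - \cot\theta. \tag{44}$$

Adding and collecting terms, this simplifies to

$$\begin{aligned}
E'_L(\theta) &= \frac A2\left(A\cos\theta + \sqrt{A^2\cos^2\theta+4}\,\right)\sin\theta - \cot\theta\\
&= AG\sin\theta - \cot\theta\\
&= \cot\theta\left[\frac{A^2}{2}\sin^2\theta + \frac A2\sin^2\theta\sqrt{A^2 + \frac{4}{\cos^2\theta}} - 1\right].
\end{aligned}\tag{45}$$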

Notice that the bracketed expression is a monotone increasing function
of $\theta$ ($0 \le \theta \le \pi/2$) ranging from $-1$ at $\theta = 0$ to $\infty$ at $\theta = \pi/2$.
Also, as mentioned above, at $\theta_0$, $G = \csc\theta_0$ and $A = \cot\theta_0$, so $E'_L(\theta_0) = 0$.
It follows that $E'_L(\theta) < 0$ for $0 \le \theta < \theta_0$ and $E'_L(\theta) > 0$ for
$\theta_0 < \theta < \pi/2$.
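The properties of $E_L$ claimed above are easy to check numerically. The sketch below is an illustration only, with function names chosen here rather than taken from the paper; it evaluates $G$, $E_L$ and the closed form $E'_L = AG\sin\theta - \cot\theta$ of (45), so that the vanishing of both at $\theta_0 = \cot^{-1}A$ and the sign change of $E'_L$ can be confirmed for any chosen $A$.

```python
import math

def G(theta, A):
    # G = (1/2) [A cos(theta) + sqrt(A^2 cos^2(theta) + 4)]
    a = A * math.cos(theta)
    return 0.5 * (a + math.sqrt(a * a + 4.0))

def E_L(theta, A):
    # E_L = A^2/2 - (1/2) A G cos(theta) - log(G sin(theta))
    g = G(theta, A)
    return 0.5 * A * A - 0.5 * A * g * math.cos(theta) - math.log(g * math.sin(theta))

def dE_L(theta, A):
    # Closed form of the derivative, as in (45): A G sin(theta) - cot(theta)
    return A * G(theta, A) * math.sin(theta) - 1.0 / math.tan(theta)

def theta0(A):
    # The angle with cot(theta0) = A, lying in (0, pi/2) for A > 0
    return math.atan2(1.0, A)
```

Comparing `dE_L` with a central difference of `E_L` gives an independent check on the algebra leading from (43) to (45).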

From this, it follows that, in the range from some $\theta_1$ to $\pi/2$ with $\theta_1 > \theta_0$,
the minimum $E_L(\theta)$ will occur at the smallest value of $\theta$ in the range,
that is, at $\theta_1$. The exponential appearing in our estimate of $Q(\theta)$,
namely, $e^{-E_L(\theta)n}$, will have its maximum at $\theta_1$, for such a range. Indeed,
for sufficiently large $n$, the maximum of the entire expression (42) must
occur at $\theta_1$, since the effect of the $n$ in the exponent will eventually
dominate anything due to the coefficient. For, if the coefficient is called
$\alpha(\theta)$ with $y(\theta) = \alpha(\theta)e^{-nE_L(\theta)}$, then

$$y'(\theta) = e^{-nE_L(\theta)}\left[-\alpha(\theta)\,nE'_L(\theta) + \alpha'(\theta)\right], \tag{46}$$

and, since $\alpha(\theta) > 0$, when $n$ is sufficiently large $y'(\theta)$ will be negative
and the only maximum will occur at $\theta_1$. In the neighborhood of $\theta_1$ the
function goes down exponentially.

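We may now find an asymptotic formula for the integral

$$Q(\theta_1) = \int_{\theta_1}^{\pi/2}\alpha(\theta)\,e^{-nE_L(\theta)}\,d\theta + Q(\pi/2) \tag{47}$$

by breaking the integral into two parts,

$$Q(\theta_1) = \int_{\theta_1}^{\theta_1+n^{-2/3}} + \int_{\theta_1+n^{-2/3}}^{\pi/2} + \;Q(\pi/2). \tag{48}$$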



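In the range of the first integral, $(1 - \epsilon)\alpha(\theta_1) \le \alpha(\theta) \le \alpha(\theta_1)(1 + \epsilon)$,
and $\epsilon$ can be made as small as desired by taking $n$ sufficiently large. This
is because $\alpha(\theta)$ is continuous and nonvanishing in the range. Also, using
a Taylor's series expansion with remainder,

$$e^{-nE_L(\theta)} = \exp\left[-nE_L(\theta_1) - n(\theta-\theta_1)E'_L(\theta_1) - n\,\frac{(\theta-\theta_1)^2}{2}\,E''_L(\theta^*)\right], \tag{49}$$

where $\theta^*$ is in the interval $\theta_1$ to $\theta$. As $n$ increases the maximum value of the
remainder term is bounded by $(n/2)\,n^{-4/3}E''_{\max}$, and consequently
approaches zero. Hence, our first integral is asymptotic to

$$\begin{aligned}
\alpha(\theta_1)\int_{\theta_1}^{\theta_1+n^{-2/3}}\exp\left[-nE_L(\theta_1) - n(\theta-\theta_1)E'_L(\theta_1)\right]d\theta
&= -\alpha(\theta_1)\exp[-nE_L(\theta_1)]\left.\frac{\exp[-n(\theta-\theta_1)E'_L(\theta_1)]}{nE'_L(\theta_1)}\right|_{\theta_1}^{\theta_1+n^{-2/3}}\\
&\sim \frac{\alpha(\theta_1)\,e^{-nE_L(\theta_1)}}{nE'_L(\theta_1)},
\end{aligned}\tag{50}$$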

since, at large $n$, the upper limit term becomes small by comparison.
The second integral from $\theta_1 + n^{-2/3}$ to $\pi/2$ can be dominated by the
value of the integrand at $\theta_1 + n^{-2/3}$ multiplied by the range

$$\pi/2 - (\theta_1 + n^{-2/3})$$

(since the integrand is monotone decreasing for large $n$). The value at
$\theta_1 + n^{-2/3}$ is asymptotic, by the argument just given, to

$$\alpha(\theta_1)\exp\left[-nE_L(\theta_1) - n\,(n^{-2/3})\,E'_L(\theta_1)\right].$$

This becomes small compared to the first integral [as does $Q(\pi/2) = \Phi(-A\sqrt{n})$ in (47)] and, consequently, on substituting for $\alpha(\theta_1)$ its value
and writing $\theta$ for $\theta_1$, we obtain as an asymptotic expression for $Q(\theta)$:

$$Q(\theta) \sim \frac{1}{\sqrt{n\pi}\,\sqrt{1+G^2}\,\sin\theta\,(AG\sin^2\theta - \cos\theta)}\left[G\sin\theta\exp\left(-\frac{A^2}{2} + \tfrac12 AG\cos\theta\right)\right]^n \qquad (\theta > \theta_0 = \cot^{-1}A). \tag{51}$$



This expression gives an asymptotic lower bound for $P_{e\,\mathrm{opt}}$, obtained by
evaluating $Q(\theta)$ for the $\theta_1$ such that $M\Omega(\theta_1) = \Omega(\pi)$.

Incidentally, the asymptotic expression (51) can be translated into
an asymptotic expression for the noncentral $t$ cumulative distribution
by substitution of variables $\theta = \cot^{-1}(t/\sqrt{f})$ and $n - 1 = f$. This may
be useful in other applications of the noncentral $t$-distribution.
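For numerical work the right member of (51) is best evaluated in logarithmic form, since it underflows quickly as $n$ grows. The following sketch does this; the helper names are hypothetical. The function `theta1_from_rate` additionally assumes the large-$n$ relation $\sin\theta_1 \approx e^{-R}$ between the cone angle and the rate in nats, which is an approximation to the exact condition $M\Omega(\theta_1) = \Omega(\pi)$.

```python
import math

def G(theta, A):
    a = A * math.cos(theta)
    return 0.5 * (a + math.sqrt(a * a + 4.0))

def log_Q_asymptotic(theta, A, n):
    """Natural log of the right member of (51); meaningful only for theta > arccot(A)."""
    g = G(theta, A)
    s, c = math.sin(theta), math.cos(theta)
    slope = A * g * s * s - c            # equals sin(theta) * E_L'(theta)
    if slope <= 0.0:
        raise ValueError("theta must exceed theta0 = arccot(A)")
    log_base = math.log(g * s) - 0.5 * A * A + 0.5 * A * g * c   # this is -E_L(theta)
    log_coeff = -math.log(math.sqrt(n * math.pi) * math.sqrt(1.0 + g * g) * s * slope)
    return n * log_base + log_coeff

def theta1_from_rate(R):
    # Assumption: asymptotic relation sin(theta1) ~ exp(-R), R in nats.
    return math.asin(math.exp(-R))
```

Dividing `-log_Q_asymptotic(theta, A, n)` by `n` recovers $E_L(\theta)$ up to a correction of order $(\log n)/n$, which is one way to see that the coefficient does not affect the exponent.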

VII. ASYMPTOTIC EXPRESSIONS FOR THE RANDOM CODE BOUND 

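We now wish to find similar asymptotic expressions for the upper
bound on $P_{e\,\mathrm{opt}}$ of (20) found by the random code method. Substituting
the asymptotic expressions for $dQ(\theta)/d\theta$ and for $\Omega(\theta)/\Omega(\pi)$ gives for an
asymptotic upper bound the following:

$$Q(\theta_1) + e^{nR}\int_0^{\theta_1}\frac{\Gamma\left(\frac n2+1\right)\sin^{n-1}\theta}{n\,\Gamma\left(\frac{n+1}{2}\right)\sqrt{\pi}\,\cos\theta}\cdot\frac{n-1}{\sqrt{\pi n}\,\sqrt{1+G^2}\,\sin^2\theta}\left[G\sin\theta\exp\left(-\frac{P}{2N} + \tfrac12\sqrt{\tfrac PN}\,G\cos\theta\right)\right]^n d\theta. \tag{52}$$

Thus we need to estimate the integral

$$W = \int_0^{\theta_1}\frac{1}{\cos\theta\,\sin^3\theta\,\sqrt{1+G^2}}\exp\left\{n\left[-\frac{P}{2N} + \tfrac12\sqrt{\tfrac PN}\,G\cos\theta + \log G + 2\log\sin\theta\right]\right\}d\theta. \tag{53}$$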

The situation is very similar to that in estimating $Q(\theta)$. Let the coefficient
of $n$ in the exponent be $D$. Note that $D = -E_L(\theta) + \log\sin\theta$.
Hence its derivative reduces to

$$\frac{dD}{d\theta} = -AG\sin\theta + 2\cot\theta. \tag{54}$$

$dD/d\theta = 0$ has a unique root $\theta_c$, $0 \le \theta_c \le \pi/2$, for any fixed $A > 0$.
This follows from the same argument used in connection with (45), the
only difference being a factor of 2 in the right member. Thus, for
$\theta < \theta_c$, $dD/d\theta$ is positive and $D$ is an increasing function of $\theta$. Beyond
this maximum, $D$ is a decreasing function.

We may now divide the problem of estimating the integral $W$ into
cases according to the relative size of $\theta_c$ and $\theta_1$.

Case 1: $\theta_1 < \theta_c$.

In this case the maximum of the exponent within the range of integration
occurs at $\theta_1$. Consequently, when $n$ is sufficiently large, the maximum
of the entire integrand occurs at $\theta_1$. The asymptotic value can be
estimated exactly as we estimated $Q(\theta)$ in a similar situation. The integral
is divided into two parts, a part from $\theta_1 - n^{-2/3}$ to $\theta_1$ and a second
part from 0 to $\theta_1 - n^{-2/3}$. In the first part the integrand behaves asymptotically
like:

$$\frac{1}{\cos\theta_1\,\sin^3\theta_1\,\sqrt{1+G^2(\theta_1)}}\exp\left\{n\left[-\frac{P}{2N} + \tfrac12\sqrt{\tfrac PN}\,G(\theta_1)\cos\theta_1 + \log G(\theta_1) + 2\log\sin\theta_1 - (\theta-\theta_1)\left(AG(\theta_1)\sin\theta_1 - 2\cot\theta_1\right)\right]\right\}. \tag{55}$$

This integrates asymptotically to

$$\frac{\exp\left\{n\left[-\frac{P}{2N} + \tfrac12\sqrt{\tfrac PN}\,G(\theta_1)\cos\theta_1 + \log G(\theta_1) + 2\log\sin\theta_1\right]\right\}}{\cos\theta_1\,\sin^3\theta_1\,\sqrt{1+G^2(\theta_1)}\,\left[-AG(\theta_1)\sin\theta_1 + 2\cot\theta_1\right]n}. \tag{56}$$



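The second integral becomes small in comparison to this, being dominated
by an exponential with a larger negative exponent multiplied by
the range $\theta_1 - n^{-2/3}$. With the coefficient

$$\frac{1}{\pi\sqrt{n}}\,\frac{\Gamma\left(\frac n2+1\right)}{\Gamma\left(\frac{n+1}{2}\right)}\,e^{nR}$$

and using the fact that

$$e^{nR} = \frac{\Omega(\pi)}{\Omega(\theta_1)} \sim \frac{n\,\Gamma\left(\frac{n+1}{2}\right)\sqrt{\pi}\,\cos\theta_1}{\Gamma\left(\frac n2+1\right)\sin^{n-1}\theta_1}, \tag{57}$$

our dominant term approaches

$$\frac{\left[G\sin\theta_1\exp\left(-\frac{P}{2N} + \tfrac12 AG\cos\theta_1\right)\right]^n}{\sqrt{n\pi}\,\sqrt{1+G^2}\,\sin\theta_1\,(2\cos\theta_1 - AG\sin^2\theta_1)}.$$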

Combining this with the previously obtained asymptotic expression
(51) for $Q(\theta_1)$ we obtain the following asymptotic expression for the upper
bound on $P_{e\,\mathrm{opt}}$ for $\theta_1 < \theta_c$:

$$\left(1 - \frac{\cos\theta_1 - AG\sin^2\theta_1}{2\cos\theta_1 - AG\sin^2\theta_1}\right)\frac{\left[G\sin\theta_1\exp\left(-\frac{P}{2N} + \tfrac12 AG\cos\theta_1\right)\right]^n}{\sqrt{n\pi}\,\sqrt{1+G^2}\,\sin\theta_1\,(AG\sin^2\theta_1 - \cos\theta_1)}. \tag{58}$$

Since our lower bound was asymptotic to the same expression without
the parenthesis in front, the two asymptotes differ only by the factor

$$\left(1 - \frac{\cos\theta_1 - AG\sin^2\theta_1}{2\cos\theta_1 - AG\sin^2\theta_1}\right)$$

independent of $n$. This factor increases as $\theta_1$ increases from the value $\theta_0$,
corresponding to channel capacity, to the critical value $\theta_c$, for which the
denominator vanishes. Over this range the factor increases from 1 to $\infty$.
In other words, for large $n$, $P_{e\,\mathrm{opt}}$ is determined to within a factor. Furthermore,
the percentage uncertainty due to this factor is smaller at
rates closer to channel capacity, approaching zero as the rate approaches
capacity. It is quite interesting that these seemingly weak bounds can
work out to give such sharp information for certain ranges of the variables.

Case 2: $\theta_1 > \theta_c$.

For $\theta_1$ in this range the previous argument does not hold, since the
maximum of the exponent is not at the end of the range of integration
but rather interior to it. This unique maximum occurs at $\theta_c$, the root of
$2\cos\theta_c - AG\sin^2\theta_c = 0$. We divide the range of integration into three
parts: 0 to $\theta_c - n^{-2/5}$, $\theta_c - n^{-2/5}$ to $\theta_c + n^{-2/5}$ and $\theta_c + n^{-2/5}$ to $\theta_1$.
Proceeding by very similar means, in the neighborhood of $\theta_c$ the exponential
behaves as

$$\exp\left(-n\left\{E_L(\theta_c) + \frac{(\theta-\theta_c)^2}{2}E''_L(\theta_c) + O\left[(\theta-\theta_c)^3\right]\right\}\right).$$




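The coefficient of the exponential approaches constancy in the small interval
surrounding $\theta_c$. Thus the integral (53) for this part is asymptotic
to

$$\begin{aligned}
\frac{1}{\cos\theta_c\,\sin^3\theta_c\,\sqrt{1+G^2}}&\int_{-\infty}^{\infty}\exp\left\{-n\left[E_L(\theta_c) + \frac{(\theta-\theta_c)^2}{2}E''_L(\theta_c)\right]\right\}d\theta\\
&\sim \frac{1}{\cos\theta_c\,\sin^3\theta_c\,\sqrt{1+G^2}}\exp\left[-nE_L(\theta_c)\right]\frac{\sqrt{2\pi}}{\sqrt{nE''_L(\theta_c)}}.
\end{aligned}\tag{59}$$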

The other two integrals become small by comparison when $n$ is large, by
essentially the same arguments as before. They may be dominated by
the value of the integrand at the end of the range near $\theta_c$ multiplied by
the range of integration. Altogether, then, the integral (52) is asymptotic
to

$$\frac{1}{\sqrt{\pi n}\,\cos\theta_c\,\sin^3\theta_c\,\sqrt{1+G^2}\,\sqrt{E''_L(\theta_c)}}\;e^{-n[E_L(\theta_c) - R]}. \tag{60}$$

The other term in (52), namely, $Q(\theta_1)$, is asymptotically small compared
to this, under the present case $\theta_1 > \theta_c$, since the coefficient of $n$
in the exponent for $Q(\theta)$ in (51) will be smaller. Thus, all told, the random
code bound is asymptotic to

$$\frac{1}{\cos\theta_c\,\sin^3\theta_c\,\sqrt{n\pi E''_L(\theta_c)\left[1 + G(\theta_c)^2\right]}}\;e^{-n[E_L(\theta_c) - R]} \tag{61}$$

for $\theta_1 > \theta_c$ or for rates $R < R_c$, the rate corresponding to $\theta_c$.

Incidentally, the rate $R_c$ is very closely one-half bit less than channel
capacity when $A \ge 4$, and approaches this exactly as $A \to \infty$. For lower
values of $A$ the difference $C - R_c$ becomes smaller but the ratio $C/R_c \to 4$
as $A \to 0$.
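The two limiting statements about the critical rate can be explored numerically. The sketch below finds $\theta_c$ by bisection on $2\cos\theta - AG\sin^2\theta = 0$ and converts angles to rates. It assumes the asymptotic relation $R = -\log\sin\theta$ (in nats) between a cone angle and its rate, so that $C = -\log\sin\theta_0 = \tfrac12\log(1 + A^2)$; the function names and the bisection tolerance are illustrative choices, not taken from the paper.

```python
import math

def G(theta, A):
    a = A * math.cos(theta)
    return 0.5 * (a + math.sqrt(a * a + 4.0))

def critical_angle(A, tol=1e-12):
    """Root of 2 cos(t) - A G(t) sin(t)^2 = 0 on (0, pi/2), found by bisection."""
    def f(t):
        return 2.0 * math.cos(t) - A * G(t, A) * math.sin(t) ** 2
    lo, hi = 1e-9, math.pi / 2.0          # f(lo) > 0 and f(hi) = -A < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def capacity(A):
    return 0.5 * math.log(1.0 + A * A)    # nats per dimension

def critical_rate(A):
    # Assumption: asymptotic relation R = -log sin(theta), in nats.
    return -math.log(math.sin(critical_angle(A)))

def gap_in_bits(A):
    return (capacity(A) - critical_rate(A)) / math.log(2.0)
```

For example, `gap_in_bits(4.0)` and `gap_in_bits(100.0)` can be compared with one-half, and `capacity(A) / critical_rate(A)` for small `A` can be compared with 4.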

VIII. THE FIRM UPPER BOUND ON $P_{e\,\mathrm{opt}}$

In this section we will find an upper bound, valid for all $n$, on the probability
of error by manipulation of the upper bound (20). We first find
an upper bound on $Q'(\theta)$. In Ref. 6 the integral (35) is transformed into
$\tilde z^{\nu}\exp(-\tfrac12\tilde z^2 + \tilde z\sqrt{\nu+1}\,w)$ times the following integral (in their notation):

$$\int_{-\tilde z}^{\infty}\exp\left\{-\tfrac12 y^2 + \nu\left[\ln\left(1 + \frac{y}{\tilde z}\right) - \frac{y}{\tilde z}\right]\right\}dy.$$

It is pointed out that the integrand here can be dominated by $e^{-y^2/2}$.
This occurs in the paragraph in Ref. 6 containing Equation 2.6. Therefore,
this integral can be dominated by $\sqrt{2\pi}$, and our integral in (34)
involved in $dQ/d\theta$ is dominated as follows:



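$$\begin{aligned}
\int_0^\infty\exp\left[-\frac{(r - A\sqrt{n}\cos\theta)^2}{2}\right]r^{n-1}\,dr
&\le \tilde z^{\,n-1}\exp\left(-\frac{\tilde z^2}{2} + \tilde zA\sqrt{n}\cos\theta\right)\exp\left(-\frac{A^2n}{2}\cos^2\theta\right)\sqrt{2\pi}\\
&= \left(\frac{\tilde z}{e}\right)^{n-1}\exp\left(\frac{\tilde z^2}{2}\right)\exp\left(-\frac{A^2n}{2}\cos^2\theta\right)\sqrt{2\pi}.
\end{aligned}$$

We have

$$\tilde z = \tfrac12\sqrt{n}\left(A\cos\theta + \sqrt{A^2\cos^2\theta + 4 - 4/n}\,\right) \le \sqrt{n}\,G.$$

Replacing $\tilde z$ by this larger quantity gives

$$\left(\frac{\sqrt{n}\,G}{e}\right)^{n-1}\exp\left(\frac{nG^2}{2} - \frac{A^2n}{2}\cos^2\theta\right)\sqrt{2\pi}.$$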

We have, then,

$$-\frac{dQ}{d\theta} \le \frac{(n-1)\sin^{n-2}\theta\,\exp\left(-\frac{A^2n}{2}\sin^2\theta\right)}{2^{n/2}\sqrt{\pi}\,\Gamma\left(\frac{n+1}{2}\right)}\left(\frac{\sqrt{n}\,G}{e}\right)^{n-1}\exp\left(\frac{nG^2}{2} - \frac{A^2n}{2}\cos^2\theta\right)\sqrt{2\pi}. \tag{62}$$

Replacing the gamma function by its Stirling expression

$$\left(\frac{n+1}{2}\right)^{n/2}\exp\left(-\frac{n+1}{2}\right)\sqrt{2\pi}$$

(which is always too small), and replacing $[1 + (1/n)]^{n/2}$ by $\sqrt{2}$ (which
is also too small) again increases the right member. After simplification,
we get

$$-\frac{dQ}{d\theta} \le \frac{(n-1)\,e^{3/2}\,(G\sin\theta)^n\exp\left[\frac n2\left(-A^2 + AG\cos\theta\right)\right]}{\sqrt{2\pi n}\,G\sin^2\theta} = \frac{(n-1)\,e^{3/2}\,e^{-E_L(\theta)n}}{\sqrt{2\pi n}\,G\sin^2\theta}. \tag{63}$$

Notice that this differs from the asymptotic expression (42) only by a
factor

$$\frac{e^{3/2}\sqrt{1+G^2}}{\sqrt{2}\,G} \le e^{3/2}$$

(since $G \ge 1$). A firm upper bound can now be placed on $Q(\theta)$:

$$Q(\theta_1) = \int_{\theta_1}^{\pi/2}\left(-\frac{dQ}{d\theta}\right)d\theta + Q\left(\frac{\pi}{2}\right).$$

We use the upper bound above for $dQ/d\theta$ in the integral. The coefficient
of $-n$ in the exponent of $e$,

$$E_L(\theta) = \tfrac12\left(A^2 - AG\cos\theta\right) - \log G\sin\theta,$$

is positive and monotone increasing with $\theta$ for $\theta > \theta_0$, as we have seen
previously. Its derivative is

$$E'_L(\theta) = AG\sin\theta - \cot\theta.$$

As a function of $\theta$ this curve is as shown in Fig. 5, either rising monotonically
from $-\infty$ at $\theta = 0$ to $A$ at $\theta = \pi/2$, or with a single maximum.
In any case, the curve is concave downward. To show this analytically,
take the second derivative of $E'_L$. This consists of a sum of negative
terms.



Fig. 5 — $E'_L(\theta)$ as a function of $\theta$.
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Returning to our upper bound on Q, the coefficient in (63) does not 
exceed 

$$\frac{\sqrt{n}\,e^{3/2}}{\sqrt{2\pi}\,\sin^2\theta_1},$$

replacing $\sin\theta$ and $G$ by $\sin\theta_1$ and 1, their minimum values in the
range. We now wish to replace $e^{-nE_L(\theta)}$ by

$$\exp\{-n[E_L(\theta_1) + (\theta - \theta_1)h]\}.$$

If $h$ is chosen equal to the minimum $E'_L(\theta)$, this replacement will increase
the integral and therefore give an upper bound. From the behavior
of $E'_L(\theta)$ this minimum occurs at either $\theta_1$ or $\pi/2$. Thus, we may
take $h = \min[A,\ AG(\theta_1)\sin\theta_1 - \cot\theta_1]$. With this replacement the
integral becomes a simple exponential and can be immediately integrated.

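The term $Q(\pi/2)$ is, of course,

$$\Phi(-A\sqrt{n}) \le \frac{1}{A\sqrt{2\pi n}}\,e^{-A^2n/2}.$$

If we continue the integral out to infinity instead of stopping at $\pi/2$, the
extra part added will more than cover $Q(\pi/2)$. In fact, $E_L(\pi/2) = A^2/2$,
so the extra contribution is at least

$$\frac{\sqrt{n}\,e^{3/2}\,e^{-A^2n/2}}{A\,n\,\sin^2\theta_1\,\sqrt{2\pi}}$$

if we integrate

$$\frac{\sqrt{n}\,e^{3/2}}{\sin^2\theta_1\,\sqrt{2\pi}}\,e^{-A^2n/2 - n(\theta-\pi/2)A}$$

to $\infty$ instead of stopping at $\pi/2$. Since $e^{3/2}/\sin^2\theta_1 \ge 1$, we may omit
the $Q(\pi/2)$ term in place of the extra part of the integral.
Consequently, we can bound $Q(\theta_1)$ as follows:

$$Q(\theta_1) \le \frac{e^{3/2}\exp\left\{(n/2)\left[AG(\theta_1)\cos\theta_1 - A^2 + 2\log G\sin\theta_1\right]\right\}}{\sqrt{2\pi n}\,\sin^2\theta_1\,\min\left[A,\ AG(\theta_1)\sin\theta_1 - \cot\theta_1\right]}. \tag{64}$$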

In order to overbound $P_{e\,\mathrm{opt}}$ by (3) it is now necessary to overbound
the term

$$\int_0^{\theta_1}\frac{\Omega(\theta)}{\Omega(\theta_1)}\,dQ(\theta).$$
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This can be done by a process very similar to that just carried out for 
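$\int dQ(\theta)$. First, we overbound $\Omega(\theta)/\Omega(\theta_1)$ using (21). We have

$$\begin{aligned}
\frac{\Omega(\theta)}{\Omega(\theta_1)} &= \frac{\int_0^{\theta}(\sin x)^{n-2}\,dx}{\int_0^{\theta_1}(\sin x)^{n-2}\,dx}\\
&= \frac{\int_0^{\theta}(\sin x)^{n-2}\,dx}{\int_0^{\theta}(\sin x)^{n-2}\,dx + \int_{\theta}^{\theta_1}(\sin x)^{n-2}\,dx}\\
&\le \frac{\int_0^{\theta}(\sin x)^{n-2}\cos x\,dx}{\int_0^{\theta}(\sin x)^{n-2}\cos x\,dx + \cos\theta\int_{\theta}^{\theta_1}(\sin x)^{n-2}\,dx}\\
&\le \frac{\int_0^{\theta}(\sin x)^{n-2}\cos x\,dx}{\int_0^{\theta}(\sin x)^{n-2}\cos x\,dx + \int_{\theta}^{\theta_1}(\sin x)^{n-2}\cos x\,dx}
\end{aligned}$$

and, finally,

$$\frac{\Omega(\theta)}{\Omega(\theta_1)} \le \frac{(\sin\theta)^{n-1}}{(\sin\theta_1)^{n-1}}. \tag{65}$$

Here the third line follows since the first integral in the denominator is
reduced by the same factor as the numerator and the second integral is
reduced more, since $\cos\theta$ is decreasing. In the next line, the denominator
is reduced still more by taking the cosine inside.

Using this inequality and also the upper bound (63) on $dQ/d\theta$, we
have

$$\begin{aligned}
\int_0^{\theta_1}\frac{\Omega(\theta)}{\Omega(\theta_1)}\left(-dQ(\theta)\right)
&\le \int_0^{\theta_1}\frac{(\sin\theta)^{n-1}}{(\sin\theta_1)^{n-1}}\,\frac{(n-1)\,e^{3/2}\,(G\sin\theta)^n\,e^{(n/2)(-A^2 + AG\cos\theta)}}{\sqrt{2\pi n}\,G\sin^2\theta}\,d\theta\\
&= \frac{(n-1)\,e^{3/2}}{\sqrt{2\pi n}\,(\sin\theta_1)^{n-1}}\int_0^{\theta_1}G^{n-1}(\sin\theta)^{2n-3}\,e^{(n/2)(-A^2 + AG\cos\theta)}\,d\theta.
\end{aligned}\tag{66}$$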

Near the point $\theta_1$ the integrand here behaves like an exponential when
$n$ is large (provided $\theta_1 < \theta_c$), and it should be possible to find a firm
upper bound of the form

$$\frac{k\,e^{-E_L(\theta_1)n}}{\sqrt{n}},$$

where $k$ would not depend on $n$. This, however, leads to considerable
complexity and we have settled for a cruder formulation as follows:

The integrand may be bounded by its maximum values. If $\theta_1 < \theta_c$,
the maximum of the integrand will occur at $\theta_1$, at least when $n$ is large
enough. In this case, the integral will certainly be bounded by

$$\theta_1\,G(\theta_1)^{n-1}(\sin\theta_1)^{2n-3}\,e^{(n/2)[-A^2 + AG(\theta_1)\cos\theta_1]}.$$

The entire expression for $P_{e\,\mathrm{opt}}$ may then be bounded by [adding in the
bound (64) on $Q(\theta_1)$]

$$P_{e\,\mathrm{opt}} \le \frac{\sqrt{n}\,e^{3/2}\,e^{-E_L(\theta_1)n}}{\sqrt{2\pi}\,\sin^2\theta_1}\left\{\theta_1 + \frac{1}{n\min\left[A,\ AG(\theta_1)\sin\theta_1 - \cot\theta_1\right]}\right\}. \tag{67}$$

It must be remembered that (67) is valid only for $\theta_1 < \theta_c$ and if $n$ is
large enough to make the maximum of the integrand above occur at $\theta_1$.
For $\theta_1 > \theta_c$, bounds could also be constructed based on the maximum
value of the integrand.

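IX. A FIRM LOWER BOUND ON $P_{e\,\mathrm{opt}}$

In this section we wish to find a lower bound on $P_{e\,\mathrm{opt}}$ that is valid for
all $n$. To do this we first find a lower bound on $Q'(\theta)$ and from this find
a lower bound on $Q(\theta)$. The procedure is quite similar to that involved
in finding the firm upper bound.

In Ref. 6, the integral (35) above was reduced to the evaluation of
the following integral (Equation 2.5 of Ref. 6):

$$\begin{aligned}
\int_{-\tilde z}^{\infty}\exp\left\{-\frac{y^2}{2} + \nu\left[\ln\left(1 + \frac{y}{\tilde z}\right) - \frac{y}{\tilde z}\right]\right\}dy
&\ge \int_0^{\infty}\exp\left\{-\frac{y^2}{2} + \nu\left[\ln\left(1 + \frac{y}{\tilde z}\right) - \frac{y}{\tilde z}\right]\right\}dy\\
&\ge \int_0^{\infty}\exp\left(-\frac{y^2}{2} - \frac{\nu y^2}{2\tilde z^2}\right)dy
\ge \int_0^{\infty}e^{-y^2}\,dy = \frac{\sqrt{2\pi}}{2\sqrt{2}} = \frac{\sqrt{\pi}}{2}.
\end{aligned}$$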
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Here we used the inequality 



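$$\ln\left(1 + \frac{y}{\tilde z}\right) - \frac{y}{\tilde z} \ge -\frac{y^2}{2\tilde z^2} \qquad\text{for } y > 0,$$

and also the fact that $\nu/\tilde z^2 \le 1$. This latter follows from Equation 2.3 of
Ref. 6 on dividing through by $\tilde z^2$.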

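Using this lower bound, we obtain from (34)

$$-\frac{dQ}{d\theta} \ge \frac{(n-1)\sin^{n-2}\theta\,\exp\left(-\frac{A^2n}{2}\right)}{2\cdot 2^{n/2}\,\Gamma\left(\frac{n+1}{2}\right)}\left(\frac{\tilde z}{e}\right)^{n-1}\exp\left(\frac{\tilde z^2}{2}\right). \tag{68}$$

Now $\tilde z \ge \sqrt{n-1}\,G$ and

$$\Gamma\left(\frac{n+1}{2}\right) < \left(\frac{n+1}{2}\right)^{n/2}e^{-(n+1)/2}\sqrt{2\pi}\,\exp\left[\frac{1}{6(n+1)}\right]$$

and, using the fact that

$$\left(\frac{n-1}{n+1}\right)^{n/2} \ge \frac13 \qquad\text{for } n \ge 2,$$

we obtain

$$-\frac{dQ}{d\theta} \ge \frac{\sqrt{n-1}\,e^{3/2}\,e^{-nE_L(\theta)}}{6\sqrt{2\pi}\,G\exp\left[\frac{G^2}{2} + \frac{1}{6(n+1)}\right]\sin^2\theta} \qquad\text{for } n \ge 2. \tag{69}$$

This is our lower bound on $dQ/d\theta$.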

To obtain a lower bound on $Q(\theta)$ we may use the same device as
before; here, however, replacing the coefficient by its minimum value in
the range and the exponent by $-nE_L(\theta_1) - n(\theta - \theta_1)E'_{L\,\max}$:

$$E'_L = AG\sin\theta - \cot\theta \le AG \le A(A+1).$$

Similarly, in the coefficient, $G$ can be dominated by $A + 1$ and $\sin^2\theta$ by
1. Thus,

$$Q(\theta_1) \ge \int_{\theta_1}^{\pi/2}\frac{\sqrt{n-1}\,e^{3/2}\,e^{-nE_L(\theta_1)}\,e^{-n(\theta-\theta_1)A(A+1)}}{6\sqrt{2\pi}\,(A+1)\exp\left[\frac{(A+1)^2}{2} + \frac{1}{6(n+1)}\right]}\,d\theta + Q\left(\frac{\pi}{2}\right). \tag{70}$$

Integrating and observing that the term due to the $\pi/2$ limit can be
absorbed into the $Q(\pi/2) = \Phi(-A\sqrt{n})$ term, we arrive at the lower bound:

$$Q(\theta_1) \ge \frac{\sqrt{n-1}\,e^{3/2}\,e^{-nE_L(\theta_1)}}{6\sqrt{2\pi}\,n\,(A+1)^3\exp\left[\frac{(A+1)^2}{2} + \frac{1}{6(n+1)}\right]}. \tag{71}$$



X. BEHAVIOR NEAR CHANNEL CAPACITY 

As we have seen, near channel capacity the upper and lower asymptotic
bounds are substantially the same. If in the asymptotic lower
bound (42) we form a Taylor expansion for $\theta$ near $\theta_0$, retaining terms
up to $(\theta - \theta_0)^2$, we will obtain an expression applying to the neighborhood
of channel capacity. Another approach is to return to the original
noncentral $t$-distribution and use its normal approximation which will
be good near the mean (see Ref. 5). Either approach gives, in this neighborhood,
the approximations [since $E(\theta_0) = E'(\theta_0) = 0$]:

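$$-\frac{dQ}{d\theta} \approx \frac{\sqrt{n}\,(A^2+1)}{\sqrt{\pi}\,\sqrt{2+A^2}}\exp\left[-n\,\frac{(A^2+1)^2}{A^2+2}\,(\theta-\theta_0)^2\right],$$

$$Q(\theta) \approx \Phi\left[(\theta_0 - \theta)\,\frac{A^2+1}{\sqrt{A^2+2}}\,\sqrt{2n}\right], \tag{72}$$

or, since near channel capacity, using $e^{-R} = \sin\theta_1$,

$$\theta_1 - \theta_0 \approx A^{-1}(C - R),$$

$$P_{e\,\mathrm{opt}}\left(M, n, \sqrt{P/N}\right) \approx \Phi\left[\frac{A^2+1}{A\sqrt{A^2+2}}\,\sqrt{2n}\,(R - C)\right] = \Phi\left[\frac{P+N}{\sqrt{P(P+2N)}}\,\sqrt{2n}\,(R - C)\right]. \tag{73}$$

The reliability curve is approximated near $C$ by

$$E(R) \approx \frac{(P+N)^2}{P(P+2N)}\,(C - R)^2. \tag{74}$$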
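The quadratic approximation (74) rests on the value of the second derivative of $E_L$ at $\theta_0$, which by (72) should equal $2(A^2+1)^2/(A^2+2)$. The sketch below checks this by finite differences and compares (74) with $E_L(\theta_1)$ at a rate slightly below capacity. It assumes the relation $\sin\theta_1 = e^{-R}$ used in the text, takes $A^2 = P/N$, and uses illustrative function names and step sizes.

```python
import math

def G(theta, A):
    a = A * math.cos(theta)
    return 0.5 * (a + math.sqrt(a * a + 4.0))

def E_L(theta, A):
    g = G(theta, A)
    return 0.5 * A * A - 0.5 * A * g * math.cos(theta) - math.log(g * math.sin(theta))

def second_derivative_at_theta0(A, h=1e-4):
    # Central second difference of E_L at theta0 = arccot(A).
    t0 = math.atan2(1.0, A)
    return (E_L(t0 + h, A) - 2.0 * E_L(t0, A) + E_L(t0 - h, A)) / (h * h)

def near_capacity_exponent(R, P, N):
    # Right member of (74), with C = (1/2) log(1 + P/N) in nats.
    C = 0.5 * math.log(1.0 + P / N)
    return (P + N) ** 2 / (P * (P + 2.0 * N)) * (C - R) ** 2

def exact_lower_exponent(R, P, N):
    # E_L evaluated at theta1 = arcsin(exp(-R)); assumes sin(theta1) = exp(-R).
    return E_L(math.asin(math.exp(-R)), math.sqrt(P / N))
```

With $P = 4$, $N = 1$ and $R = C - 0.01$, the two exponents should agree to within a few percent, and the agreement improves as $R$ approaches $C$.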

It is interesting that Rice 2 makes estimates of the behavior of what 
amounts to a lower bound on the exponent E near channel capacity. 
His exponent, translated into our notation, is 

a poorer value than (74); that is, it will take a larger block length to 
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achieve the same probability of error. This difference is evidently due to 
the slight difference in the manner of construction of the random codes. 
Rice's codes are obtained by placing points according to an n-dimensional
gaussian distribution, each coordinate having variance P. In our
codes the points are placed at random on a sphere of precisely fixed radius
$\sqrt{nP}$. These are very close to the same thing when n is large,
since in Rice's situation the points will, with probability approaching 1,
lie between the spheres of radii $\sqrt{nP}\,(1 - \epsilon)$ and $\sqrt{nP}\,(1 + \epsilon)$ (any
$\epsilon > 0$). However, we are dealing with very small probability events in
any case when we are estimating probability of error, and the points
within the sphere are sufficiently important to affect the exponent E. In
other words, the Rice type of code is sufficient to give codes that will
have a probability of error approaching zero at rates arbitrarily near
channel capacity. However, they will not do so at as rapid a rate (even
in the exponent) as can be achieved. To achieve the best possible E it
is evidently necessary to avoid having too many of the code points interior
to the $\sqrt{nP}$ sphere.

At rates $R$ greater than channel capacity we have $\theta_1 < \theta_0$. Since the
$Q$ distribution approaches normality with mean at $\theta_0$ and variance
$(A^2 + 2)/[2n(A^2 + 1)^2]$, we will have $Q(\theta_1)$ approaching 1 with increasing
$n$ for any fixed rate greater than $C$. Indeed, even if the rate $R$
varies but remains always greater than $C$ (perhaps approaching it from
above with increasing $n$), we will still have $P_{e\,\mathrm{opt}} > \tfrac12 - \epsilon$ for any $\epsilon > 0$
and sufficiently large $n$.

XI. UPPER BOUND ON $P_{e\,\mathrm{opt}}$ BY METHOD OF EXHAUSTION

For low rates of transmission, where the upper and lower bounds di- 
verge widely, we may obtain better estimates by other methods. For 
very low rates of transmission, the main contribution to the probability 
of error can be shown to be due to the code points that are nearest to- 
gether and thus often confused with each other, rather than to the gen- 
eral average structure of the code. The important thing, at low rates, is 
to maximize the minimum distance between neighbors. Both the upper 
and lower bounds which we will derive for low rates are based on these 
considerations. 

We will first show that, for $D \le \sqrt{2nP}$, it is possible to find at least

$$M_D = \left(\sin 2\sin^{-1}\frac{D}{2\sqrt{nP}}\right)^{-(n-1)}$$

points on the surface of an $n$ sphere of radius $\sqrt{nP}$ such that no pair
of them is separated by a distance less than $D$. (If $M_D$ is not an integer,
take the next larger integer.) The method used will be similar to one
used by E. N. Gilbert for the binary symmetric channel.

Select any point on the sphere's surface for the first point. Delete from
the surface all points within $D$ of the selected point. In Fig. 6, $x$ is the
selected point and the area to be deleted is that cut out by the cone.
This area is certainly less (if $D \le \sqrt{2nP}$) than the area of the hemisphere
of radius $H$ shown and, even more so, less than the area of the
sphere of radius $H$. If this deletion does not exhaust the original sphere,
select any point from those remaining and delete the points within $D$
of this new point. This again will not take away more area than that of
a sphere of radius $H$. Continue in this manner until no points remain.
Note that each point chosen is at least $D$ from each preceding point.
Hence all interpoint distances are at least $D$. Furthermore, this can be
continued at least as many times as the ratio of the surface of a sphere
of radius $\sqrt{nP}$ to that of a sphere of radius $H$, since each deletion takes
away not more than this much surface area. This ratio is clearly

$$\left(\frac{\sqrt{nP}}{H}\right)^{n-1}.$$



By simple geometry in Fig. 6, we see that $H$ and $D$ are related as follows:

$$\sin\theta = \frac{H}{\sqrt{nP}}, \qquad \sin\frac{\theta}{2} = \frac{D}{2\sqrt{nP}}.$$

Hence

$$H = \sqrt{nP}\,\sin 2\sin^{-1}\frac{D}{2\sqrt{nP}}.$$

Substituting, we can place at least

$$M_D = \left(\sin 2\sin^{-1}\frac{D}{2\sqrt{nP}}\right)^{-(n-1)} \tag{75}$$

points at distances at least $D$ from each other, for any $D \le \sqrt{2nP}$.
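The exhaustion argument can be imitated on a computer, with one important caveat: a program cannot delete regions of a continuous sphere, so the sketch below instead draws a finite number of random candidate points on the sphere and keeps each candidate that is at least $D$ from every point already kept. This is an approximation to the procedure in the text, chosen here for illustration; the function names, the candidate count and the seed are all assumptions.

```python
import math
import random

def gilbert_count(n, P, D):
    """M_D of (75): guaranteed number of points at mutual distance >= D."""
    s = math.sin(2.0 * math.asin(D / (2.0 * math.sqrt(n * P))))
    return s ** (-(n - 1))

def greedy_points(n, P, D, candidates=2000, seed=0):
    """Greedy selection among random candidates on the sphere of radius sqrt(nP)."""
    rng = random.Random(seed)
    radius = math.sqrt(n * P)
    chosen = []
    for _ in range(candidates):
        # A normalised gaussian vector is uniformly distributed on the sphere.
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        p = [radius * x / norm for x in v]
        # Keep the candidate only if it respects the minimum distance.
        if all(math.dist(p, q) >= D for q in chosen):
            chosen.append(p)
    return chosen
```

In low dimensions the greedy selection typically finds noticeably more points than $M_D$, which is consistent with (75) being a guarantee rather than an estimate of the best packing.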

If we have $M_D$ points with minimum distance at least $D$, then the probability
of error with optimal decoding will be less than or equal to

$$M_D\,\Phi\left(-\frac{D}{2\sqrt{N}}\right).$$

To show this we may add up pessimistically the probabilities of each
point being received as each other point. Thus the probability of point
1 being moved closer to point 2 than to the original point 1 is not greater
than $\Phi[-D/(2\sqrt{N})]$, that is, the probability of the point being moved
in a certain direction at least $D/2$ (half the minimum separation).
The contribution to errors due to this cause cannot, therefore, exceed
$(1/M_D)\,\Phi[-D/(2\sqrt{N})]$ (the $1/M_D$ factor being the probability of message
1 being transmitted). A similar argument occurs for each (ordered)
pair of points, a total of $M_D(M_D - 1)$ contributions of this kind. Consequently,
the probability of error cannot exceed $(M_D - 1)\,\Phi[-D/(2\sqrt{N})]$
or, more simply, $M_D\,\Phi[-D/(2\sqrt{N})]$.
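If we set

$$e^{nR} = M_D = \left(\sin 2\sin^{-1}\frac{D}{2\sqrt{nP}}\right)^{-(n-1)},$$

then the rate $R$ (in natural units) is

$$R = -\left(1 - \frac1n\right)\log\left(\sin 2\sin^{-1}\frac{D}{2\sqrt{nP}}\right)$$

with

$$P_e \le e^{nR}\,\Phi\left(-\frac{D}{2\sqrt{N}}\right) \le e^{nR}\,\frac{\sqrt{2N}}{D\sqrt{\pi}}\,e^{-D^2/(8N)}, \tag{76}$$

using the well-known upper bound $\Phi(-x) \le (1/x\sqrt{2\pi})\,e^{-x^2/2}$. These are
parametric equations in terms of $D$. It is more convenient to let

$$D = \lambda\sqrt{2nP}.$$

We then have

$$R = -\left(1 - \frac1n\right)\log\left(\sin 2\sin^{-1}\frac{\lambda}{\sqrt{2}}\right), \qquad P_e \le \frac{1}{\lambda\sqrt{\pi nP/N}}\,e^{n[R - (\lambda^2P)/(4N)]}. \tag{77}$$

Fig. 6 — Geometry of sphere of radius $\sqrt{nP}$, showing the hemisphere of radius $H$.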

The asymptotic reliability, that is, the coefficient of $-n$ in the exponent
of $P_e$, is given by $(\lambda^2P/4N) - R$. This approaches

$$\left(\sin\tfrac12\sin^{-1}e^{-R}\right)^2\frac{P}{2N} - R \qquad\text{as } n \to \infty.$$

Thus our asymptotic lower bound for reliability is (eliminating $\lambda$):

$$E \ge \left(\sin\tfrac12\sin^{-1}e^{-R}\right)^2\frac{P}{2N} - R. \tag{78}$$

As $R \to 0$ the right-hand expression approaches $P/(4N)$.
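A short sketch of the two forms of this bound may help when tabulating it. The first function evaluates the right member of (78); the second evaluates the finite-$n$ pair (77) for a given $\lambda$, returning the rate and the logarithm of the bound on $P_e$. The function names and the use of the single parameter `snr` for $P/N$ are conventions adopted here, not notation from the paper.

```python
import math

def exhaustion_exponent(R, snr):
    """Right member of (78); R in nats, snr = P/N."""
    half_angle = 0.5 * math.asin(math.exp(-R))
    return math.sin(half_angle) ** 2 * snr / 2.0 - R

def finite_n_bound(lam, n, snr):
    """Rate R and log of the bound on P_e from (77), for D = lam*sqrt(2nP), 0 < lam <= 1."""
    R = -(1.0 - 1.0 / n) * math.log(math.sin(2.0 * math.asin(lam / math.sqrt(2.0))))
    log_pe = n * (R - lam * lam * snr / 4.0) - math.log(lam * math.sqrt(math.pi * n * snr))
    return R, log_pe
```

For large `n`, `-log_pe / n` from `finite_n_bound` should approach `exhaustion_exponent` evaluated at the corresponding asymptotic rate, and `exhaustion_exponent` should tend to `snr / 4` as `R` tends to zero.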

This lower bound on the exponent is plotted in the curves in Section 
XIV and it may be seen to give more information at low rates than the 
random code bound. It is possible, however, to improve the random 
coding procedure by what we have called an "expurgating" process. It 
then becomes the equal of the bound just derived and, in fact, is some- 
what stronger over part of the range. We shall not go into this process 
in detail but only mention that the expurgating process consists of 
eliminating from the random code ensemble points which have too close 
neighbors, and working with the codes that then remain. 

XII. LOWER BOUND ON $P_e$ IN GAUSSIAN CHANNEL BY MINIMUM DISTANCE
ARGUMENT

In a code of length $n$ with $M$ code words, let $m_{is}$ ($i = 1, 2, \ldots, M$;
$s = 1, 2, \ldots, n$) be the $s$th coordinate of code word $i$. We are here assuming
an average power limitation $P$, so that

$$\frac{1}{nM}\sum_{i,s}m_{is}^2 \le P. \tag{79}$$

We also assume an independent gaussian noise of power $N$ added to each
coordinate.

We now calculate the average squared distance between all the
$M(M - 1)/2$ pairs of points in $n$-space corresponding to the $M$ code
words. The squared distance from word $i$ to word $j$ is

$$\sum_s (m_{is} - m_{js})^2.$$

The average $D^2$ between all pairs will then be

$$\overline{D^2} = \frac{1}{M(M-1)}\sum_{i,j,s}(m_{is} - m_{js})^2.$$

Note that each distance is counted twice in the sum and also that the
extraneous terms included in the sum, where $i = j$, contribute zero to it.
Squaring the terms in the sum,

$$\begin{aligned}
\overline{D^2} &= \frac{1}{M(M-1)}\left(\sum_{i,j,s}m_{is}^2 - 2\sum_s\sum_{i,j}m_{is}m_{js} + \sum_{i,j,s}m_{js}^2\right)\\
&= \frac{1}{M(M-1)}\left[2M\sum_{i,s}m_{is}^2 - 2\sum_s\left(\sum_i m_{is}\right)^2\right]\\
&\le \frac{2MPnM}{M(M-1)} = \frac{2nMP}{M-1},
\end{aligned}\tag{80}$$

where we obtain the third line by using the inequality on the average
power (79) and by noting that the second term is necessarily nonpositive.
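The inequality (80) holds for every code meeting the power constraint (79), so it can be spot-checked on arbitrary examples. The sketch below builds a random code, rescales it so that its average power is exactly $P$, and computes the average squared distance over ordered pairs as in the text. The helper names, the gaussian choice of code words and the seed are illustrative assumptions.

```python
import random

def average_squared_distance(code):
    """Sum of squared distances over all ordered pairs (i, j), divided by M(M-1)."""
    M = len(code)
    total = 0.0
    for i in range(M):
        for j in range(M):       # terms with i == j contribute zero
            total += sum((a - b) ** 2 for a, b in zip(code[i], code[j]))
    return total / (M * (M - 1))

def random_code(M, n, P, seed=0):
    """M random words of length n, rescaled so the average power per coordinate is exactly P."""
    rng = random.Random(seed)
    code = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(M)]
    energy = sum(x * x for word in code for x in word)
    scale = (n * M * P / energy) ** 0.5
    return [[scale * x for x in word] for word in code]

def distance_bound(M, n, P):
    # Right member of (80).
    return 2.0 * n * M * P / (M - 1)
```

Equality in (80) requires the code words to sum to zero in every coordinate; the two-word code `[[1.0], [-1.0]]` with $P = 1$ is the simplest instance.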

If the average squared distance between pairs of points is

$$\le \frac{2nMP}{M-1},$$

there must exist a pair of points for whose distance this inequality holds.
Each point in this pair is used $1/M$ of the time. The best detection for
separating this pair (if no other points were present) would be by a
hyperplane normal to and bisecting the joining line segment. Either
point would then give rise to a probability of error equal to that of the
noise carrying a point half this distance or more in a specified direction.
We obtain, then, a contribution to the probability of error at least

$$\frac1M\Pr\left\{\text{noise in a certain direction} \ge \frac12\sqrt{\frac{2nMP}{M-1}}\right\} = \frac1M\,\Phi\left[-\sqrt{\frac{nMP}{(M-1)\,2N}}\,\right].$$
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This we may assign to the first of the two points in question, and the 
errors we have counted are those when this message is sent and is re- 
ceived closer to the second message (and should therefore be detected 
as the second or some other message). 

Now delete this first message from the set of code points and consider
the remaining $M - 1$ points. By the same argument there must exist
among these a pair whose distance is less than or equal to

$$\sqrt{\frac{2nP(M-1)}{M-2}}.$$

This pair leads to a contribution to probability of error, due to the first
of these being displaced until nearer the second, of an amount

$$\frac1M\,\Phi\left[-\sqrt{\frac{nP(M-1)}{(M-2)\,2N}}\,\right].$$

This same argument is continued, deleting points and adding contributions
to the error, until only two points are left. Thus we obtain a lower
bound on $P_{e\,\mathrm{opt}}$ as follows:

$$P_{e\,\mathrm{opt}} \ge \frac1M\left[\Phi\left(-\sqrt{\frac{nP}{2N}\,\frac{M}{M-1}}\,\right) + \Phi\left(-\sqrt{\frac{nP}{2N}\,\frac{M-1}{M-2}}\,\right) + \cdots + \Phi\left(-\sqrt{\frac{nP}{2N}\,\frac{2}{1}}\,\right)\right]. \tag{81}$$

To simplify this bound somewhat, one may take only the first $M/2$ terms
[or $(M + 1)/2$ if $M$ is odd]. Since they are decreasing, each term would
be reduced by replacing it with the last term taken. Thus we may reduce
the bound by these operations and obtain

$$P_{e\,\mathrm{opt}} \ge \frac12\,\Phi\left(-\sqrt{\frac{M}{M-2}\,\frac{nP}{2N}}\,\right). \tag{82}$$

For any rate $R > 0$, as $n$ increases the term $M/(M - 2)$ approaches 1
and the bound, then, behaves about as

$$\frac12\,\Phi\left(-\sqrt{\frac{nP}{2N}}\,\right).$$

This is asymptotic to

$$\frac{1}{2\sqrt{\pi nP/N}}\,e^{-nP/(4N)}.$$

It follows that the reliability $E \le P/(4N) = A^2/4$. This is the same
value as the lower bound for $E$ when $R \to 0$.
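The simplified bound (82) is straightforward to evaluate with the complementary error function. The sketch below does so and can be used to watch the exponent approach $P/(4N)$ as $n$ grows. The function names and the parameter `snr` for $P/N$ are choices made here for illustration.

```python
import math

def Phi(x):
    # Standard normal cumulative distribution, written with erfc for accuracy in the tail.
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def min_distance_lower_bound(M, n, snr):
    """Right member of (82): (1/2) * Phi(-sqrt(M/(M-2) * n*snr/2)); requires M > 2."""
    if M <= 2:
        raise ValueError("M must exceed 2")
    return 0.5 * Phi(-math.sqrt(M / (M - 2.0) * n * snr / 2.0))

def implied_exponent(M, n, snr):
    # -(1/n) log of the bound; tends to snr/4 for fixed positive rate as n grows.
    return -math.log(min_distance_lower_bound(M, n, snr)) / n
```

For instance, with `snr = 1.0`, `n = 2000` and `M = math.exp(0.1 * n)`, the value of `implied_exponent` should be close to 0.25, the small excess coming from the factor in front of the exponential.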
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XIII. ERROR BOUNDS AND OTHER CONDITIONS ON THE SIGNAL POINTS 

Up to now we have (except in the last section) assumed that all signal
points were required to lie on the surface of the sphere, i.e., have a
mean square value $\sqrt{nP}$. Consider now the problem of estimating
$P'_{e\,\mathrm{opt}}(M, n, \sqrt{P/N})$, where the signal points are only required to lie
on or within the spherical surface. Clearly, since this relaxes the conditions
on the code, it can only improve, i.e., decrease the probability of
error for the best code. Thus $P'_{e\,\mathrm{opt}} \le P_{e\,\mathrm{opt}}$.

On the other hand, we will show that

$$P'_{e\,\mathrm{opt}}\left(M, n, \sqrt{\frac PN}\right) \ge P_{e\,\mathrm{opt}}\left(M, n+1, \sqrt{\frac PN}\right). \tag{83}$$

In fact, suppose we have a code of length n, all points on or within the 
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Fig. 7 — Curves showing E L vs. R for .4 = J, \ and h. 
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n sphere. To each code word add a further coordinate of such value that 
in the ?i -f- 1 space the point thus formed lies exactly on the n + 1 sphere 
surface. If the first n coordinates of a point have values x t , x 2 , ■ • • , x„ 
with 

the added coordinate will have the value 



•r„+i = 



(n + l)P -£*/■ 



This gives a derived code of the first type (all points on the n + 1 
sphere surface) with M words of length n + 1 at signal-to-noise ratio 
P/N. The probability of error for the given code is at least as great as 
that of the derived code, since the added coordinate can only improve 
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Fig. 9 — Curves showing E L vs. different values of R for A = 3. 



the decoding process. One might, for example, decode ignoring the last 
coordinate and then have the same probability of error. Using it in the 
best way would, in general, improve the situation. 

The probability of error for the derived code of length $n + 1$ must be
greater than or equal to that of the optimal code of the length $n + 1$
with all points on the surface. Consequently we have (83). Since
$P_{e\,\mathrm{opt}}(M, n, \sqrt{P/N})$ varies essentially exponentially with $n$ when $n$ is
large, the effect of replacing $n$ by $n + 1$ is essentially that of a constant
multiplier. Thus, our upper bounds on $P_{e\,\mathrm{opt}}$ are not changed and our
lower bounds are multiplied by a quantity which does not depend much
on $n$ when $n$ is large. The asymptotic reliability curves consequently
will be the same. Thus the $E$ curves we have plotted may be applied in
either case.

Now consider the third type of condition on the points, namely, that
the average squared distance from the origin of the set of points be less
than or equal to $nP$. This again is a weakening of the previous conditions
and hence the optimal probability of error, $P''_{e\,\mathrm{opt}}$, is less than or equal
to that of the previous cases:

$$P''_{e\,\mathrm{opt}}\left(M, n, \sqrt{\tfrac PN}\right) \le P'_{e\,\mathrm{opt}}\left(M, n, \sqrt{\tfrac PN}\right) \le P_{e\,\mathrm{opt}}\left(M, n, \sqrt{\tfrac PN}\right). \tag{84}$$
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Fig. 10 — Curves showing E L vs. different values of It for A = 4. 



Our upper bounds on probability of error (and, consequently, lower 
bounds on reliability) can be used as they stand. 

Lower bounds on $P''_{e\,\mathrm{opt}}$ may be obtained as follows. If we have M
points whose mean square distance from the origin does not exceed nP,
then for any $\alpha$ ($0 < \alpha \le 1$) at least $\alpha M$ of the points are within a sphere
of squared radius $nP/(1 - \alpha)$. [For, if more than $(1 - \alpha)M$ of them
were outside the sphere, these alone would contribute more than

$$(1 - \alpha)M \cdot \frac{nP}{1 - \alpha}$$

to the total squared distance, and the mean would then necessarily be
greater than nP.] Given an optimal code under the third condition, we
can construct from it, by taking $\alpha M$ points within the sphere of radius
$\sqrt{nP/(1 - \alpha)}$, a code satisfying the second condition with this smaller
number of points and larger radius. The probability of error for the new
code cannot exceed $1/\alpha$ times that of the original code. (Each new code
word is used $1/\alpha$ times as much; when used, its probability of error is at
least as good as previously.) Thus:
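$$P''_{e\,\mathrm{opt}}\left(M, n, \sqrt{\frac{P}{N}}\right) \ge \alpha\, P'_{e\,\mathrm{opt}}\left(\alpha M, n, \sqrt{\frac{P}{(1-\alpha)N}}\right) \ge \alpha\, P_{e\,\mathrm{opt}}\left(\alpha M, n+1, \sqrt{\frac{P}{(1-\alpha)N}}\right).$$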
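The counting step in this argument (at least a fraction $\alpha$ of the points lie within the sphere of squared radius $nP/(1-\alpha)$) can be checked on a small example. The sketch below is illustrative only: the helper name `points_within_sphere` and the toy code are hypothetical, and it requires $\alpha < 1$ strictly so that the radius is finite.

```python
def points_within_sphere(code, P, alpha):
    """Keep the code words whose squared norm is at most nP/(1 - alpha).
    (Hypothetical helper; assumes the code meets the average-power condition.)"""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1 for a finite radius")
    n = len(code[0])
    mean_sq = sum(sum(x * x for x in w) for w in code) / len(code)
    if mean_sq > n * P:
        raise ValueError("mean squared distance exceeds nP")
    limit = n * P / (1.0 - alpha)
    return [w for w in code if sum(x * x for x in w) <= limit]

# Toy code: n = 2, P = 1. Squared norms are 0, 1, 1, 5, with mean 1.75 <= nP = 2.
code = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 1.0]]
P, alpha = 1.0, 0.5
kept = points_within_sphere(code, P, alpha)
print(len(kept), ">=", alpha * len(code))  # number kept vs. the guaranteed alpha * M
```

Here the sphere has squared radius 4, so only the word with squared norm 5 is discarded, and the number kept is at least $\alpha M = 2$, as the argument guarantees.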



XIV. CURVES FOR ASYMPTOTIC BOUNDS 



Fig. 11. Curves showing $E_L$ vs. different values of R for A = 8 and 16.

Fig. 12. Channel capacity, C, and critical rate, $R_c$, as functions of $\theta$.

Curves have been calculated to facilitate evaluation of the exponents
in these asymptotic bounds. The basic curves range over values of
A = 1/8, 1/4, 1/2, 1, 2, 3, 4, 8, 16. Figs. 7 through 11 give the coefficient
of n, $E_L$, as a function of the rate R. Since $E_L$ strictly is a function of
$\theta$, and the relation between $\theta$ and R depends somewhat on n, a number
of slightly different R scales are required at the bottom of the curve.
This, however, was considered a better means of presenting the data
than the use of auxiliary curves to relate R and $\theta$. These same curves
give the coefficient of n in the upper bounds (the straight line part
together with the curve to the right of the straight line segment). The point
of tangency is the critical R (or critical $\theta$). In other words, the curve and
the curve plus straight line, read against the $n = \infty$ scale, give upper
and lower bounds on the reliability measure. The upper and lower bounds
on E for low R are also included in these curves. The upper bound is the
horizontal line segment running out from R = 0, $E = A^2/4$. The lower
bound is the curved line running down from this point to the tangent
line. Thus, the reliability E lies in the four-sided figure defined by these
lines to the left of $R_c$. It is equal to the curve to the right of $R_c$. Fig. 12
gives channel capacity C and the critical rate $R_c$ as functions of $\theta$. For
A very small, the $E_L(R)$ curve approaches a limiting form. In fact, if
$\theta = (\pi/2) - \epsilon$, with $\epsilon$ small, to a close approximation by obvious
expansions we find

$$E_L(R) \approx \frac{A^2}{2} - A\epsilon + \frac{\epsilon^2}{2}$$

and

$$R \approx \frac{\epsilon^2}{2}.$$

Eliminating $\epsilon$, we obtain

$$\frac{E_L(R)}{A^2} \approx \frac{1}{2} - \sqrt{\frac{2R}{A^2}} + \frac{R}{A^2}.$$

Fig. 13 plots $E_L(R)/A^2$ against $R/A^2$.

Fig. 13. Plots of $E_L(R)/A^2$ against $R/A^2$.
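The limiting curve can be tabulated directly. The sketch below assumes the small-A approximations $E_L \approx A^2/2 - A\epsilon + \epsilon^2/2$ and $R \approx \epsilon^2/2$, which is my reading of the expansions in this section; eliminating $\epsilon$ gives $E_L/A^2 \approx 1/2 - \sqrt{2r} + r$ with $r = R/A^2$. The function name `limiting_exponent` is hypothetical.

```python
import math

def limiting_exponent(r):
    """Small-A limiting form of E_L/A^2 as a function of r = R/A^2.
    Intended for 0 <= r <= 1/2 (an assumption of this sketch)."""
    return 0.5 - math.sqrt(2.0 * r) + r

# The same expression written as a perfect square: (1/sqrt(2) - sqrt(r))^2.
def limiting_exponent_square(r):
    return (1.0 / math.sqrt(2.0) - math.sqrt(r)) ** 2

for r in (0.0, 0.125, 0.25, 0.5):
    print(r, limiting_exponent(r), limiting_exponent_square(r))
```

The curve starts at 1/2 for r = 0 and falls to zero at r = 1/2. The perfect-square form makes it clear that the approximate exponent is never negative on this range.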